I have two functions: GetItemIDList, which scrapes all the product IDs of a shop on an e-commerce site, and GetID_Image, which fetches a product's image gallery (about 8 images per product) and stores it locally.

Both of them work smoothly when run individually.

But I do not know how to use the output of function 1, which is the list of product IDs shown below, as the input for a loop over function 2.

Currently, function 2 can only fetch the image gallery when I give it one specific product ID.

So I want to add a loop (using the list generated by function 1) that gets the image galleries of all products, instead of manually keying each item ID into function 2.

Output of function 1:

1242118776
1379832161
2055592163
bla bla
1230767270

Function 1's script:

def GetItemIDList (shopid):
    i=0
    while i<20:
        headers = {
            'User-Agent': 'Mozilla/5',
            'Referer': 'myrefererurl'
        }
        url = 'myAPIurl_pre'+str(shopid)+'myAPIurl_end'  
        r = requests.get(url, headers = headers, timeout= 5).json()
        for item in r['items']:
            itemid_list=item['itemid']
            print(itemid_list)
        i=i+1

Function 2 script:

def GetID_Image(item_id):
    headers = {
        'User-Agent': 'Mozilla/5',
        'Referer': 'myrefererheader'
    }   
    url = 'pre_myurl'+item_id+'end_myurl'
    r = requests.get(url, headers = headers, timeout= 5).json()
    itemid_shop=r['item']['itemid']
    itemname_shop=r['item']['name']
    print(itemname_shop)
    itemimage_shop=r['item']['images']
    endtag_image=range (11) #range(len(list(itemimage_shop))
    for imageid,i in zip(itemimage_shop,endtag_image):
        image_fronturl="myfronturl"
        image_fullurl=image_fronturl+imageid
        myfile = requests.get(image_fullurl, allow_redirects=True)
        open('mylocalfolder'+itemname_shop+'_'+str(i)+'.jpg', 'wb').write(myfile.content)

Please help me with this. Thanks so much!

  • Notice that your first function GetItemIDList does not return anything. So it can't feed your second function with inputs... – dallonsi Jun 7 at 9:25
  • Consider adding return itemid_list in your first function. – dallonsi Jun 7 at 9:25
  • Yes, I see. Thank you for your advice – Huynh Jun 7 at 12:45
First, get_items returns a list of the item IDs available in a specific shop (looked up by shop_id), collected over 20 requests.

import requests

def get_items(shop_id):
    item_list = []
    # Repeat the request 20 times, as in the original function.
    for _ in range(20):
        resp = requests.get(
            url='my_api_url_pre/{}/my_api_url_end'.format(shop_id),
            headers={
                'User-Agent': 'Mozilla/5',
                'Referer': 'myrefererurl'
            },
            timeout=5
        ).json()

        # Collect every item ID instead of printing it.
        items = resp['items']
        for item in items:
            item_list.append(item['itemid'])

    return item_list
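
One thing to watch for, assuming the request URL really is identical on every iteration (the placeholder URL does not show a paging parameter): the 20 requests would then return the same items each time, so the returned list would contain repeats. A minimal sketch of dropping the repeats while keeping the original order; unique_ids is a hypothetical helper name, not part of the functions here:

```python
def unique_ids(item_ids):
    # dict keys are unique and keep insertion order (Python 3.7+),
    # so this removes repeated IDs without reordering the rest.
    return list(dict.fromkeys(item_ids))
```

It could be applied to the list returned by get_items before looping, so each item's images are downloaded only once.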

Then save_images stores all the images available for an item (given its item_id) locally.

def save_images(item_id):
    resp = requests.get(
        url='pre_my_url/{}/end_my_url'.format(item_id),
        headers={
            'User-Agent': 'Mozilla/5',
            'Referer': 'myrefererheader'
        },
        timeout=5
    ).json()

    item = resp['item']
    item_name = item['name']
    item_images = item['images']

    for i, image_id in enumerate(item_images):
        image = requests.get(
            url='my_front_url/{}'.format(image_id),
            allow_redirects=True
        )
        # One file per image: <item name>_<index>.jpg
        filename = 'my_local_folder/{}_{}.jpg'.format(item_name, i)
        with open(filename, 'wb') as file:
            file.write(image.content)
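
Because the file name is built from the item's name, a name containing a character such as / or : could make open() fail or write to an unexpected path. The following is only a sketch of one way to guard against that; safe_filename and its replacement rule are assumptions for illustration, not something the API or the original code provides:

```python
import re

def safe_filename(name, replacement='_'):
    # Replace characters that are not allowed in file names
    # on Windows or that act as path separators.
    cleaned = re.sub(r'[\\/:*?"<>|]', replacement, name)
    # Fall back to a fixed name if nothing usable is left.
    return cleaned.strip() or 'unnamed'
```

The item name would be passed through this helper before it is formatted into the file name.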

Finally, loop over the list of items in a shop and store their images locally.

item_list = get_items(SOME_SHOP_ID)

for item_id in item_list:
    save_images(item_id)
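
With many items, a single timeout or a missing key would stop the whole loop. As a hedged sketch, the loop can be wrapped so that a failing item is skipped and reported instead; save_all is a hypothetical helper, and it takes the per-item function as an argument so it can be tried without network access:

```python
def save_all(item_ids, save_one):
    failed = []
    for item_id in item_ids:
        try:
            save_one(item_id)
        except Exception as exc:
            # Broad on purpose for this sketch; in real use, narrow this
            # to the errors you expect (network errors, missing keys).
            print('skipping {}: {}'.format(item_id, exc))
            failed.append(item_id)
    # IDs that could not be saved, so they can be retried later.
    return failed
```

It would be called as save_all(item_list, save_images), and the returned list shows which item IDs need another attempt.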
  • That works perfectly. Thank you so much! – Huynh Jun 7 at 12:44
  • @Huynh You're welcome! Don't mention it! – mrzrm Jun 7 at 12:52
