Build a Retail Scraper for the Apocalypse

Victory!

Hello from the retail apocalypse of 2020. If you're reading this in the future, this was an attempt to build a tool to make life a little easier during the Covid-19 / panic-buying pandemic. The problem: it's become difficult to find basic stuff like diapers, baby formula, toilet paper, flour, bleach, etc. Every store seems to be out of stock.

So let's put down our shotguns and pick up our Python tutorials.

Here's a quick script I wrote to periodically check various retailers and see if certain items are back in stock and available for shipping. Tip from the TP Fairy: Obviously, don't use this script to hoard. Be kind and gentle, like double ply, and buy only what you need.

import urllib.request
import urllib.parse
import json
import time
import random
import winsound



# CONFIG --------------------------------------

#Replace these products with whatever.
PRODUCT_INFOS = [

        {
            "name"      :   "Target Up&Up 30 Mega Rolls",
            "url"       :   "https://redsky.target.com/v1/location_details/78603854?latitude=XXX&longitude=XXX&zip=XXXXX&state=XX&storeId=XXXX&scheduled_delivery_store_id=XXXX&fulfillment_test_mode=grocery_opu_team_member_test",
        },

        {
            "name"      :   "Charmin Ultra Soft 30 Mega Rolls",
            "url"       :   "https://redsky.target.com/v1/location_details/75663300?latitude=XXX&longitude=XXX&zip=XXXXX&state=XX&storeId=XXXX&scheduled_delivery_store_id=XXXX&fulfillment_test_mode=grocery_opu_team_member_test",
        },

        {
            "name"      :   "Diapers Size 3 x 108",
            "url"       :   "https://redsky.target.com/v1/location_details/16313662?latitude=XXX&longitude=XXX&zip=XXXXX&state=XX&storeId=XXXX&scheduled_delivery_store_id=XXXX&fulfillment_test_mode=grocery_opu_team_member_test",
        },

        {
            "name"      :   "Kirkland Signature Bath Tissue, 2-Ply, 425 sheets, 30 rolls",
            "url"       :   "https://www.costco.com/AjaxGetContractPrice?itemId=507802&catalogId=10701&productId=507801&WH=XXXXXX",
        },

    ]


HEADERS = {
        "User-Agent"        : "Mozilla/5.0 (Windows NT 10.0; rv:68.0) Gecko/20100101 Firefox/68.0",
        }


#----------------------------------------------



def PlaySound():
    winsound.PlaySound("available.wav", winsound.SND_FILENAME)




def CheckAvailability_Target(productInfo):

    print("Target")

    #Adjust headers. Work on a copy, so the shared HEADERS dict is not modified.
    customHeader = dict(HEADERS)
    customHeader["Referer"] = "https://target.com/"
    customHeader["Accept"] = "application/json"

    #Assemble and submit the request.
    req = urllib.request.Request( productInfo["url"], headers=customHeader )
    response = urllib.request.urlopen( req ).read().decode()

    #Parse the json response into usable data.
    data = json.loads( response )

    #Check the field that indicates availability.
    status = data["product"]["ship_methods"]["availability_status"]
    if status == "UNAVAILABLE":
        return False
    elif status == "AVAILABLE":
        return True

    #Unknown.
    print("Unknown status.")
    return False



def CheckAvailability_Costco(productInfo):

    print("Costco")

    #Adjust headers. Work on a copy, so the shared HEADERS dict is not modified.
    customHeader = dict(HEADERS)
    customHeader["Referer"] = "https://costco.com/"
    customHeader["Accept"] = "application/json"
    #customHeader["Cookie"] = ""

    #This seems to work only with 2-day-shipping products.
    #A product's page loads with initial inventory availability info set,
    #but the 2-day-shipping products override this info with a subsequent xhr.
    #Need to look into this a little more when I have time.
    #This currently works for Kirkland TP, though, so good enough for now.

    #Assemble and submit the request.
    req = urllib.request.Request( productInfo["url"], headers=customHeader )
    response = urllib.request.urlopen( req ).read().decode()

    #Parse the json response into usable data.
    data = json.loads( response )

    if "invAvailable" in data:
        status = data["invAvailable"]   #boolean    
        return status

    print("Unknown status.")
    return False



def CheckAvailability_Walmart(productInfo):

    print("Walmart (not yet implemented)")
    return False



def CheckAvailability(productInfo):

    # Call retailer-specific version of this function.

    if "target.com" in productInfo["url"]:
        return CheckAvailability_Target(productInfo)

    elif "costco.com" in productInfo["url"]:
        return CheckAvailability_Costco(productInfo)

    elif "walmart.com" in productInfo["url"]:
        return CheckAvailability_Walmart(productInfo)

    #Unknown retailer.
    print("Unknown retailer.")
    return False






# ----------------------------------------------

if __name__ == '__main__':


    while True:

        for productInfo in PRODUCT_INFOS:

            print("")
            print("Check: " + productInfo["name"])

            isAvailable = CheckAvailability(productInfo)

            if isAvailable:
                print("AVAILABLE AVAILABLE AVAILABLE AVAILABLE!!!!")
                PlaySound()

            else:
                print("Not available.")

            print("")

            #Sleep a short random amount of time, to look less like a robot.
            time.sleep( random.uniform(1.0, 3.0) )

        #Sleep for a few minutes before doing the whole thing again.
        time.sleep( 60 * 5 )
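
As written, the loop above stops at the first hiccup: a failed request raises an exception from urllib, and a response with a different layout raises a KeyError. Here is a minimal sketch of a wrapper that treats any such failure as "not available" so the loop keeps going. SafeCheck is a hypothetical helper, not part of the original script.

```python
import json


def SafeCheck(checkFunction, productInfo):
    """Run checkFunction(productInfo), reporting False instead of crashing.

    SafeCheck is a hypothetical helper, not part of the original script.
    """
    try:
        return bool(checkFunction(productInfo))
    except OSError as e:
        # URLError, HTTPError and socket timeouts all derive from OSError.
        print("Request failed: " + str(e))
    except (json.JSONDecodeError, KeyError, TypeError) as e:
        # The response was not the JSON layout the checker expected.
        print("Unexpected response: " + repr(e))
    return False
```

In the main loop, the direct call would then become `isAvailable = SafeCheck(CheckAvailability, productInfo)`.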

How to Use the Script

  1. Make sure Python 3 is installed.

  2. Copy and paste the above script into retailScraper.py.

  3. Save this .wav file into the same folder as the script, under the name available.wav (the filename the script looks for). This fabulous sound will play when one of your products becomes available.

  4. Edit the config section of the script with the XHR URLs and product names you want. See below for how to find these.

  5. Run it. The script will keep checking each of the products indefinitely, and play an alert when one of them becomes available.
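
One catch with the steps above: the winsound module ships only with Python on Windows, so the script won't start on other systems. A rough sketch of a fallback, assuming a terminal beep is an acceptable substitute for the .wav file. PlayAlert is a hypothetical replacement for PlaySound, not something from the original script.

```python
import sys

try:
    import winsound  # Standard library, but only present on Windows.
except ImportError:
    winsound = None


def PlayAlert(wavPath="available.wav"):
    """Play the alert and return the name of the backend that was used."""
    if winsound is not None:
        winsound.PlaySound(wavPath, winsound.SND_FILENAME)
        return "winsound"
    # Fallback: the ASCII bell character, which many terminals turn into a beep.
    sys.stdout.write("\a")
    sys.stdout.flush()
    return "bell"
```

To use it, the `import winsound` line at the top of the script would be swapped for the try/except above, and the call to PlaySound() for PlayAlert().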

Target.com URLs

Make sure you set your zip code with the menu at the top left. Find the page for the product you want on Target.com. In Firefox, press Ctrl+Shift+K to open the developer console (the keystroke might be different in your browser), then refresh the page.

Now you can see that each time a product loads, and each time you select a "size" or "color" or whatever variation, a request is sent to "redsky.target.com" to gather availability and other info. Copy this URL into the config section of the script.
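
Before pasting a copied URL into the config, it can help to check that it really points at redsky.target.com and carries your zip code and store. A small sketch using only urllib.parse; DescribeUrl is a hypothetical helper. Reading the last path segment as the product's ID is an assumption, based on it being the one part that differs between the Target URLs in the config.

```python
import urllib.parse


def DescribeUrl(url):
    """Split a copied request URL into host, last path segment and query fields."""
    parts = urllib.parse.urlsplit(url)
    return {
        "host": parts.netloc,
        "lastPathSegment": parts.path.rstrip("/").rsplit("/", 1)[-1],
        "query": dict(urllib.parse.parse_qsl(parts.query)),
    }


info = DescribeUrl(
    "https://redsky.target.com/v1/location_details/78603854"
    "?latitude=XXX&longitude=XXX&zip=XXXXX&state=XX&storeId=XXXX"
)
print(info["host"])             # redsky.target.com
print(info["lastPathSegment"])  # 78603854
print(info["query"]["zip"])     # XXXXX
```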

Costco.com URLs

As with Target, make sure you set your zip code on the site first, so that Costco sends a nearby warehouse ID with each request. In Firefox, press Ctrl+Shift+K to open the developer console (the keystroke might be different in your browser), then refresh the page.

Look for an "xhr" request sent to "costco.com/Ajax blah blah..." (in the config above, it's the AjaxGetContractPrice one). Copy this URL into the config section of the script. Note that it includes your local warehouse ID.

This seems to work for "2 day shipping" products on Costco only. I might investigate further for other types of products later.
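
Since the warehouse ID is baked into the copied URL, checking a different warehouse means editing the URL. A sketch of doing that without hand-editing, assuming the WH query field is the warehouse ID (that's how the placeholder in the config above reads, but I haven't confirmed it). WithWarehouse is a hypothetical helper and YYYYYY is a placeholder, not a real ID.

```python
import urllib.parse


def WithWarehouse(url, warehouseId):
    """Return url with its WH query field set to warehouseId.

    If the URL has no WH field, it comes back with its query unchanged.
    """
    parts = urllib.parse.urlsplit(url)
    pairs = urllib.parse.parse_qsl(parts.query, keep_blank_values=True)
    pairs = [(key, warehouseId if key == "WH" else value) for key, value in pairs]
    newQuery = urllib.parse.urlencode(pairs)
    return urllib.parse.urlunsplit(parts._replace(query=newQuery))


newUrl = WithWarehouse(
    "https://www.costco.com/AjaxGetContractPrice"
    "?itemId=507802&catalogId=10701&productId=507801&WH=XXXXXX",
    "YYYYYY",
)
print(newUrl)
```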

Walmart.com URLs

Not yet implemented. Walmart doesn't appear to list out-of-stock products in its grocery pickup section at all, so this might need a different approach.

Fun fact: Having trouble finding a pickup time slot available? Walmart refreshes them at midnight.

Happy retail scraping!

Update: Future generations will never appreciate what a soaring achievement this was.

(c) Kyle Gabler. All Rights Reserved. Terms and Conditions: By using this site, you agree that everything that was once yours is now mine, in perpetuity, throughout the universe. Enjoy.