Tech Note: A Regular Expression to Extract the ASIN Product Code From Any Amazon URL

So much fun it must be ASIN
LGF • Views: 43,961

Every once in a while I come up with a bit of code that might be useful to other programmers, and here’s one of those bits — a regular expression that extracts the ASIN product code from almost any Amazon product URL, including URLs from non-US Amazon stores. (It doesn’t handle pages of search results or other index pages, however.)

I use this at LGF to rewrite links to Amazon and add our affiliate ID to the link, so if someone clicks the link we get a small percentage of the purchase price when they buy something.

This code used to be a series of regular expressions that matched each type of URL, but I was looking at it this week and realized it might be possible to condense the whole process down to just one regex. And after extensive testing, I think I may have gotten pretty close to a universal Amazon ASIN extractor. (I searched Google, and couldn’t find anything this good online.)

In this PHP code, I used the x modifier to let me split the regex across multiple lines with comments and indenting. I also used ~ for the regular expression delimiter so all the forward slashes don’t need to be escaped. These two steps make the code much more readable, but if you use this regex with a language that doesn’t allow you to change the delimiter you’ll need to insert a backslash before each forward slash.

(Notice that I used non-capturing groups for everything except the ASIN product ID.)

$regex = '~
	(?:www\.)?				# optionally starts with www.
	ama?zo?n\.				# also allow shortened amzn.com URLs
	(?:
		com					# match the supported Amazon domains (add others as needed)
		|
		ca
		|
		co\.uk
		|
		co\.jp
		|
		de
		|
		fr
	)
	/
	(?:						# here comes the stuff before the ASIN
		exec/obidos/ASIN/	# the possible components of a URL
		|
		o/
		|
		gp/product/
		|
		(?:					# the dp/ format may contain a title
			(?:[^"\'/]*)/	# anything but a slash or quote
		)?					# optional
		dp/
		|					# if short format, nothing before ASIN
	)
	([A-Z0-9]{10})			# capture group $1 contains the ASIN
	(?:						# everything after the ASIN
		(?:/|\?|\#)			# beginning with /, ? or #
		(?:[^"\'\s]*)		# everything up to quote or white space
	)?						# optional
~isx';
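
To see what the single capturing group buys you, here is a minimal sketch that pulls out just the ASIN with preg_match. It uses the same pattern written on one line; the helper name extract_asin and the test URLs are made up for illustration.

```php
<?php
// The pattern from above on a single line (no x modifier needed in this form).
$regex = '~(?:www\.)?ama?zo?n\.(?:com|ca|co\.uk|co\.jp|de|fr)/(?:exec/obidos/ASIN/|o/|gp/product/|(?:(?:[^"\'/]*)/)?dp/|)([A-Z0-9]{10})(?:(?:/|\?|\#)(?:[^"\'\s]*))?~is';

// Hypothetical helper: returns the ASIN, or null when the URL does not match.
function extract_asin(string $url, string $regex): ?string
{
    // Every other group is non-capturing, so $m holds only
    // the full match in $m[0] and the ASIN in $m[1].
    if (preg_match($regex, $url, $m) === 1) {
        return $m[1];
    }
    return null;
}

var_dump(extract_asin('http://www.amazon.co.uk/gp/product/B00RSGFRY8', $regex)); // string(10) "B00RSGFRY8"
var_dump(extract_asin('http://example.com/dp/B00RSGFRY8', $regex));              // NULL
```

If the other groups were capturing, the ASIN would land at a higher index and the replacement string would have to change every time the pattern did.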

Here’s a bit of code that puts this regex to work and inserts our affiliate ID into the reconstructed link:

$text = preg_replace($regex, 'www.amazon.com/dp/$1/?tag=littlegreenfo-20', $text);

And here’s what this regular expression looks like when condensed down to one gnarly line:

$regex = '~(?:www\.)?ama?zo?n\.(?:com|ca|co\.uk|co\.jp|de|fr)/(?:exec/obidos/ASIN/|o/|gp/product/|(?:(?:[^"\'/]*)/)?dp/|)([A-Z0-9]{10})(?:(?:/|\?|\#)(?:[^"\'\s]*))?~isx';
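
If you need the version with / as the delimiter, the escaping mentioned earlier can be done mechanically instead of by hand. A rough sketch, kept in PHP so the two forms can be compared side by side; the variable names and the test URL are only for illustration.

```php
<?php
// Pattern body without delimiters or modifiers (taken from the commented version).
$body = '(?:www\.)?ama?zo?n\.(?:com|ca|co\.uk|co\.jp|de|fr)/(?:exec/obidos/ASIN/|o/|gp/product/|(?:(?:[^"\'/]*)/)?dp/|)([A-Z0-9]{10})(?:(?:/|\?|\#)(?:[^"\'\s]*))?';

// With ~ as the delimiter, forward slashes can stay as they are.
$tilde = '~' . $body . '~i';

// With / as the delimiter, every forward slash gets a backslash in front.
$slash = '/' . str_replace('/', '\/', $body) . '/i';

$url = 'http://www.amazon.de/dp/B00RSGFRY8?tag=x';
preg_match($tilde, $url, $a);
preg_match($slash, $url, $b);

// Both forms should find the same ASIN.
var_dump($a[1] === $b[1]); // bool(true)
```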

If you run this regular expression against the following URL:

http://www.amazon.com/Man-High-Castle/dp/B00RSGFRY8/ref=sr_1_1?s=instant-video&ie=UTF8&qid=1421879835&sr=1-1&keywords=the%20man%20in%20the%20high%20castle

You end up with this:

http://www.amazon.com/dp/B00RSGFRY8/?tag=littlegreenfo-20
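
The same rewrite can be checked against the other URL shapes the pattern names. This is only a sketch: the ASIN is the one from the example above, and the paths are illustrative rather than copied from real product pages.

```php
<?php
$regex = '~(?:www\.)?ama?zo?n\.(?:com|ca|co\.uk|co\.jp|de|fr)/(?:exec/obidos/ASIN/|o/|gp/product/|(?:(?:[^"\'/]*)/)?dp/|)([A-Z0-9]{10})(?:(?:/|\?|\#)(?:[^"\'\s]*))?~is';
$replacement = 'www.amazon.com/dp/$1/?tag=littlegreenfo-20';

// Illustrative links: old-style obidos, gp/product on a non-US store, short amzn.com.
$urls = [
    'http://www.amazon.com/exec/obidos/ASIN/B00RSGFRY8',
    'http://www.amazon.co.uk/gp/product/B00RSGFRY8/ref=abc',
    'http://amzn.com/B00RSGFRY8',
];

foreach ($urls as $url) {
    // The scheme (http://) is outside the match, so it is left in place.
    echo preg_replace($regex, $replacement, $url), "\n";
}
// Each line printed is:
// http://www.amazon.com/dp/B00RSGFRY8/?tag=littlegreenfo-20
```

Note that the replacement string always points at www.amazon.com, so a link to another store ends up on the US site. If that matters for your use, the domain alternation would need to become a capturing group as well.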
