This is part 2 of a two-part tutorial on scraping configurable product data, which began here. If you're not interested in the scraping aspect of this project, you can download the Magento sample data here. Warning: this sample data might be NSFW if you have an uptight boss.
The first thing we need to do is set up our categories. Open the products spreadsheet in Excel or something similar, highlight the _category column (column L), and copy it. Paste it into a text editor, then sort the values and remove duplicates. Each unique value needs to exist as a category in Magento or you will get an import error. When you create the categories, set them to 'Enabled' so they show up in your storefront. After you make them, drag all your categories into the "Default Category" root category.
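If you'd rather skip the copy/paste step, the same list can be pulled out with a few lines of PHP. This is only a sketch: the function name uniqueColumnValues is made up for this example, and it assumes the spreadsheet has been saved as a comma-separated file whose first row holds the column names.

```php
<?php
// Hypothetical helper: return the sorted, de-duplicated values of one
// named column in a CSV file whose first row is the header.
function uniqueColumnValues(string $path, string $column): array
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    $header = fgetcsv($handle, 0, ',', '"', '\\');
    if ($header === false) {
        fclose($handle);
        throw new RuntimeException("$path is empty");
    }
    $index = array_search($column, $header, true);
    if ($index === false) {
        fclose($handle);
        throw new RuntimeException("Column $column not found");
    }
    $values = array();
    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        // Blank lines and blank cells are skipped.
        $value = isset($row[$index]) ? trim($row[$index]) : '';
        if ($value !== '') {
            $values[] = $value;
        }
    }
    fclose($handle);
    $unique = array_values(array_unique($values));
    sort($unique, SORT_STRING);
    return $unique;
}
```

Calling uniqueColumnValues('products.csv', '_category') should give you the list of categories to create, assuming that is what you named the file.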
Next we need to set up the attributes. Copy columns _super_attribute_code and _super_attribute_option (S and T) into your text editor and sort/remove duplicates. You'll see unique values for 2 attributes, color and size. These both need to be set up in Magento.
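The same idea works for the attribute columns, except here it helps to group the options under each attribute code. A rough sketch, assuming you have already read the two columns into an array of [code, option] pairs (groupAttributeOptions is a name invented for this example):

```php
<?php
// Hypothetical helper: group the distinct option values under each
// attribute code. $rows is a list of array(code, option) pairs.
function groupAttributeOptions(array $rows): array
{
    $grouped = array();
    foreach ($rows as $row) {
        list($code, $option) = $row;
        $code = trim((string) $code);
        $option = trim((string) $option);
        if ($code === '' || $option === '') {
            continue; // skip rows where either column is blank
        }
        if (!isset($grouped[$code])) {
            $grouped[$code] = array();
        }
        if (!in_array($option, $grouped[$code], true)) {
            $grouped[$code][] = $option;
        }
    }
    foreach ($grouped as &$options) {
        sort($options, SORT_STRING);
    }
    unset($options); // break the reference left over from the loop
    ksort($grouped);
    return $grouped;
}
```

The result has one key per attribute (color and size for this data), each holding the options you need to add under Manage Label / Options.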
Go to Catalog / Attributes / Manage Attributes, create a new color attribute if it doesn't exist, and set:
Scope - Global
Catalog Input Type for Store Owner - Dropdown
Use To Create Configurable Product - Yes
Next click on Manage Label / Options and add your color options. Then do the same for size.
Now you're all set for a clean import. Head over to System - Import / Export - Import, select Products, and upload the spreadsheet. If you are using Magento Go you can import the product images in the same way. Otherwise you'll want to FTP those to your media folder.
Thursday, November 28, 2013
Wednesday, November 20, 2013
Scrape a Website to Magento Configurable Product Import Format
Today I'm going to show how to scrape store products and export them to Magento's import format while keeping the associated configurable product options. Like most things that involve Magento, this required a lot of patience and trial and error.
The goals of the project are:
- Learn how to scrape ecommerce data to Magento's configurable product import format
- Get some sexy Magento sample store data for use in future testing and mock-ups
Let's go over some of the code. First we instantiate our CSV object (yes, it's a global variable. I'm okay with that.) Then we load the listings page and iterate through each listing. Pretty self-explanatory so far.
$csv = new CSV('products.csv', $fields, ",", null); // no utf-8 BOM

// and start scraping
$url = 'http://www.spicylingerie.com/';
$page = $browser->get($url);
foreach($page->search('//div[@class="fp-pro-name"]/a') as $a){
  scrape($a);
  echo '.';
}
So now we pass the a elements that have the details page URLs to our scrape function. Because we earlier set $browser->convertUrls = true, we no longer need to worry about converting our relative hrefs to absolute URLs. The library takes care of that for us.
Now we get the page for the link and start building our $item array, which we will pass to the save() function. Other than the ugly expression for the description, this was easy.
$url = $a->getAttribute('href');
$page = $browser->get($url);
$item = array();
$item['name'] = trim($a->nodeValue);
$item['description'] = $item['short_description'] = trim($page->at('//div[@class="pro-det-head"]/h4/text()[normalize-space()][position()=last()]')->nodeValue);
if(!preg_match('/Sale price: \$(\d+\.\d{2})/', $page->body, $m)) die('missing price!');
$item['price'] = $m[1];
if(!preg_match('/Style# : ([\w-]+)/', $page->body, $m)) die('missing sku!');
$item['sku'] = $m[1];
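If you want to try the two regular expressions without hitting the site, you can run them against a string. This is a minimal sketch: the function name extractPriceAndSku and the sample markup are invented for the example, and it returns null where the scraper above calls die().

```php
<?php
// Hypothetical helper: pull the sale price and style number out of a page
// body using the same patterns as the scraper. Returns null when either
// one is missing.
function extractPriceAndSku(string $body): ?array
{
    if (!preg_match('/Sale price: \$(\d+\.\d{2})/', $body, $price)) {
        return null;
    }
    if (!preg_match('/Style# : ([\w-]+)/', $body, $sku)) {
        return null;
    }
    return array('price' => $price[1], 'sku' => $sku[1]);
}

// Made-up page fragment, not taken from the real site.
$sample = '<p>Style# : AB-1234</p><p>Sale price: $19.95</p>';
```

Note that the price pattern requires exactly two decimal places, so a price shown without cents would not match and would need a looser pattern.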
Next we save the image for later import/upload, identify the categories we care about, and construct our items. The options need to look like:
$options = array(
array('size' => '12', 'color' => 'purple'),
array('size' => '10', 'color' => 'yellow')
);
Where the array keys are the codes of the attributes you have set up as configurable product attributes (Scope: Global, Input Type: Dropdown, Use To Create Configurable Product: Yes).
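Since a stray or missing key here is an easy mistake to make, it can be worth checking the options before handing them to save(). The following is just a sketch of one way to do that; validateOptions is a hypothetical helper and is not part of the scraper.

```php
<?php
// Hypothetical helper: check that every option uses exactly the expected
// configurable attribute codes. Returns a list of problems; an empty
// list means the options look consistent.
function validateOptions(array $options, array $attributeCodes): array
{
    $problems = array();
    sort($attributeCodes);
    foreach ($options as $i => $option) {
        $keys = array_keys($option);
        sort($keys);
        if ($keys !== $attributeCodes) {
            $problems[] = "Option $i has keys [" . implode(', ', $keys)
                . "], expected [" . implode(', ', $attributeCodes) . "]";
        }
    }
    return $problems;
}

$options = array(
    array('size' => '12', 'color' => 'purple'),
    array('size' => '10', 'color' => 'yellow'),
);
```

Running validateOptions($options, array('size', 'color')) on the example above returns an empty array, while an option that is missing its color would be reported by index.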
That's all there is to it. I won't go into the save function because hopefully that one will just work for you.