
data crawler to login & spider inventory data from distributor website to csv file

$30-100 USD

In Progress
Posted more than 18 years ago

Paid on delivery
We need an automated crawler that logs into a distributor's warehouse website and downloads inventory data from its tables into a delimited file. The site to crawl is the login/search catalog section of www.electrograph.com. I have saved copies of their pages locally to demonstrate what needs to be done. After the project closes, we will provide actual login details for the live site so the job can be completed.

Walkthrough of what needs to be done:

Login Home Page [login to view URL]
Go to the main website and log in using the form in the upper left-hand corner of the page. The user name and password should be configurable.

Successfully Logged In [login to view URL]
After the login has been processed successfully, the page refreshes and now includes a "My Account" section in the upper left-hand corner. Additionally, the "keyword/ Item# search" form is now enabled for our specific account; when submitted, it displays the pricing and inventory quantities available to our account.

Currently their website lets you browse the inventory by category and then paginate through the results (it cannot show all products on one page). We need to follow each category in the select menu "ddlCategory" individually, download all the data on the page in the specified format, and continue to the next page of results if one exists.

Crawling first result page of the first category searched "Accessories" [login to view URL]
This page displays the information we want to store in a delimited file. We need to trim and store the Model #, Manufacturer, Description, Availability, and Reseller Price columns. Each table row becomes a new line in the delimited file. Take note of the Availability column: it shows the total quantity in stock followed by an "I" icon. When you hover over this "I" icon, it displays a breakdown of which warehouse locations the product is stored in.
For example: 18 (the "I" icon says: 14 - NY, 4 - NV, meaning 14 units in stock in New York and 4 units in stock in Nevada). We need to store both the total quantity available and the individual location quantities, with one column per warehouse location.

Crawling second/additional result page(s) of the first category searched "Accessories" (page 2+) [login to view URL]
Perform the same process as in Step 2 (the first results page), downloading and storing all the inventory data, and continue to the next page if it exists. (Note: in the saved version of this page that I provided, the JavaScript that shows the individual warehouse breakdown is not working; it will of course work on the live site.)

Crawling first result page of additional LARGE category searched "Plasma Displays" [login to view URL] (interim refine page) [login to view URL] (actual results page)
For some categories that contain a substantial number of products, clicking "SEARCH" does not display results right away. Instead it brings you to another "search plasma displays" form where you can refine your results and search by attributes. We do not want to do this; we simply want to press the "GO" button, which displays all the products in that category in the same manner as Step 2.

Crawling second/additional result page(s) of additional LARGE category searched "Plasma Displays" [login to view URL]
Perform the same process as in Step 2, downloading and storing all the inventory data, and continue to the next page if it exists.
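The category-and-pagination flow described above can be sketched as a loop that does not depend on how pages are actually fetched. This is a minimal Python sketch under stated assumptions, not a finished crawler: `fetch_page` is a hypothetical callback (it would own the logged-in session, select the category in "ddlCategory", and press "GO" on any interim refine page) that returns the parsed rows for one results page plus a flag saying whether another page exists. `delay_ms` is the pause between page navigations in milliseconds.

```python
import time
from typing import Callable, Iterable, Iterator, List, Tuple

Row = List[str]
# Hypothetical callback: (category, page_number) -> (rows_on_page, has_next_page)
FetchPage = Callable[[str, int], Tuple[List[Row], bool]]


def crawl(
    categories: Iterable[str],
    fetch_page: FetchPage,
    delay_ms: int = 0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[Row]:
    """Yield every row of every results page, one category at a time."""
    for category in categories:
        page = 1
        while True:
            rows, has_next = fetch_page(category, page)
            yield from rows
            if not has_next:
                break
            # Pause between page navigations, as the posting requests.
            sleep(delay_ms / 1000.0)
            page += 1
```

Passing `sleep` in as a parameter is only a convenience so the loop can be exercised against the saved local pages without real waiting.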
The end result needs to be a comma-delimited file.

Example result for parsing of example link [login to view URL]

Model Number, Manufacturer, Description, Reseller Price, Total Available Qty, Location NY Qty, Location NV Qty, Location XX Qty
ACE615, ADCOM, ACE-615 ILS SURGE (120V), 315.00, 12, 12, 0, 0
TRAVEL CS/42"PANASON, CALZONE CASE CO, TRAVEL CASE 42" PANASONIC, 345.33, 0, 0, 0, 0
FSD-4100, CHIEF MANUFACTURING, FSD-4100, 97.39, 0, 0, 0, 0
CMA-0608, CHIEF MANUFACTURING, 6'-8' ADJUSTABLE PLATE, 93.39, 0, 0, 0, 0
RC-1PXL, ELECTROGRAPH SYSTEMS, 24-BUTTON SWITCH PANEL FOR VS-1XL, 104.76, 0, 0, 0, 0
RC-1XL, ELECTROGRAPH SYSTEMS, NEW MODEL NUMBER (WAS VS-1XL) REMO, 104.76, 0, 0, 0, 0
FRAME-O, ELECTROGRAPH SYSTEMS, SINGLE GANG FRAME TO HOLD UP TO 3 W, 245.35, 0, 0, 0, 0
FRAME-W, ELECTROGRAPH SYSTEMS, SINGLE GANG FRAME TO HOLD UP TO 3 W, 14.89, 5, 5, 0, 0

Notice that on the website some products show a quantity and some say "call for availability". We need to be able to map whatever text is in that field to a text/numerical equivalent. For example, in this implementation we define "Call for availability" as 0. Also, because they are always adding and changing warehouse locations, we need to leave room at the end of the delimited file for new locations. When text is found in the quantity available field, we look up its equivalent and apply that value to all the location columns as well. For example, "call for availability" results in 0, 0, 0, 0 (Total Quantity Available, Location 1 Qty, Location 2 Qty, Location 3 Qty). We should make room for up to 10 warehouse locations (0, 0, 0, 0, 0, 0, 0, 0, 0, 0). When a quantity is not defined for a warehouse that is indexed, we replace it with zero. In this example, "Call for availability" means the product is not in stock, so we mark it and all subsequent warehouse locations as 0.
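One way to apply the availability rules above is a small helper that produces the total plus ten location columns for each row. This is a sketch under assumptions: the hover text is assumed to look like "14 - NY, 4 - NV", location codes are assumed to be two uppercase letters, and `availability_columns`, `TEXT_TO_QTY`, and the `locations` ordering are hypothetical names chosen for illustration. The live site's markup may require a different pattern.

```python
import re
from typing import Dict, List

MAX_LOCATIONS = 10  # the posting asks for room for up to 10 warehouse columns

# Text-to-quantity mapping; the posting's example defines this one entry.
TEXT_TO_QTY: Dict[str, int] = {"call for availability": 0}

# Assumed hover-text format: "<qty> - <two-letter code>" pairs, comma separated.
_PAIR = re.compile(r"(\d+)\s*[-,]?\s*([A-Z]{2})")


def availability_columns(total_text: str, tooltip: str, locations: List[str]) -> List[int]:
    """Return [total, loc_1, ..., loc_10] for one product row.

    `locations` fixes the column order, e.g. ["NY", "NV"]. Unknown
    non-numeric text raises ValueError so new phrases are noticed.
    """
    key = total_text.strip().lower()
    if key in TEXT_TO_QTY:
        # Mapped text applies to the total and to every location column.
        return [TEXT_TO_QTY[key]] * (1 + MAX_LOCATIONS)

    per_location = {loc: int(qty) for qty, loc in _PAIR.findall(tooltip)}
    # Warehouses missing from the hover text count as zero.
    cols = [per_location.get(loc, 0) for loc in locations][:MAX_LOCATIONS]
    # Pad so files stay aligned when new locations are added later.
    cols += [0] * (MAX_LOCATIONS - len(cols))
    return [int(key)] + cols
```

For instance, `availability_columns("18", "14 - NY, 4 - NV", ["NY", "NV"])` gives 18, 14, 4 followed by eight zeros, and "Call for availability" gives eleven zeros.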
I also need to be able to control the delimiter used in the output file (I have used a comma in this illustration for ease) and the delay between page navigations (in milliseconds). A database should not be necessary; a simple config file is fine.

We need to get this project completed ASAP. We have several data crawlers that need to be created; the winner of this project can expect future work developing similar crawlers.
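A simple config file of the kind requested could be an INI file read with Python's standard `configparser`, with rows written through the standard `csv` module. The sketch below is illustrative only: the `[crawler]` section, its key names, and the sample values are assumptions, not something the posting specifies.

```python
import configparser
import csv
import io
from typing import Iterable, List, TextIO

# Hypothetical config layout; section and key names are assumptions.
SAMPLE_CONFIG = """
[crawler]
username = your-user
password = your-password
delimiter = ,
delay_ms = 1500
"""


def load_settings(text: str) -> dict:
    """Parse INI text into a plain settings dictionary."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["crawler"]
    return {
        "username": section["username"],
        "password": section["password"],
        "delimiter": section.get("delimiter", fallback=","),
        "delay_ms": section.getint("delay_ms", fallback=1000),
    }


def write_rows(rows: Iterable[List[str]], out: TextIO, delimiter: str) -> None:
    """Write rows using the configured single-character delimiter."""
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    writer.writerows(rows)


if __name__ == "__main__":
    settings = load_settings(SAMPLE_CONFIG)
    buffer = io.StringIO()
    write_rows([["ACE615", "ADCOM", "315.00"]], buffer, settings["delimiter"])
    print(buffer.getvalue(), end="")
```

Note that `csv.writer` quotes any field containing the delimiter or a double quote (such as `TRAVEL CS/42"PANASON`), so the output stays parseable even though it will not look exactly like the unquoted illustration above.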
Project ID: 29093

About the project

1 proposal
Remote project
Active 19 years ago

1 freelancer is bidding an average of $95 USD for this job
I have developed site crawlers in the past. These crawlers can handle cookie-based sessions, JavaScript URLs, and HTTP/HTML redirects. I can use my existing codebase to complete this project. This project can be implemented in Java. With Java you can run it on your desktop and move it to a server if you want to automate it in the future. Please let me know if I can provide more information.
$95 USD in 5 days
5.0 (2 reviews)
4.2

About the client

Brooklyn, United States
5.0
19
Payment method verified
Member since Jun 23, 2005

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)