Objective: Use Perl or Python (strongly prefer Python) to compare similarities between, both orthographic similarity and phonetic similarity. (This is also a replication of previous academic study in drug names). Actual programming is not hard at all. You might need to take a little bit time to figure out each measure for orthographic similarity and phonetic similarity in Python or Perl.
Instructions:
I have 2 csv, tab delimited datasets: TreatName and ControlName. There are 4 columns in each dataset. The first column contains the ID of the drug, the second column represents the year of the production, the third column is date, and the fourth column indicates drug names. You will not be given the 2 datasets, but exerts from the 2 datasets are shown as below to give you a better idea of how they look like. You can simulate your own datasets if necessary. In fact, each dataset contains about 800 million row observations. Therefore, it is crucial to make sure your code is cohesive and efficient.
Dataset 1: TreatName Dataset 2: ControlName
ID Year Date DrugName ID Year Date DrugName
510001 2001 20010101 Axnieo Dex 16322 1996 19961111 Olexiny
510002 2001 20010630 Deliow 16358 1999 19991012 Weiliny
82468 1999 19990208 Tyleno.A 47829 2001 20010201 Delexiny.2
98465 1999 19991112 Plownix 78966 2001 20010911 Rexineo Celio
Following the attached pdf named “drug name”, you will write a program to measure both the orthographic similarity and phonetic similarity between two drug names between TreatName and ControlName in a given year. All measures are listed in the pdf “drug name”. You may need to I believe, Python and Perl have most of the functions built in already. For each measure, please document the Python or Perl function used for it.
In the end, I want you to produce a csv dataset named Targe, containing results of orthographic similarity and phonetic similarity between two drug names. This data set . See attached WORD file for a sample snapshot of Target.
Basically, for each year, you compare all drug names in dataset TreatName to all drug names in dataset ControlName for that given year.
Please let me know if you have any questions.
Hi! I am well experienced in python programming. I suppose I will be able to accomplish a task you need. I can deliver you high quality and efficient code in reasonable time.
$80 USD em 3 dias
5,0 (228 avaliações)
6,5
6,5
2 freelancers estão ofertando em média $165 USD for esse trabalho
Hello, your project is very well explained and clear. I'm a Python developer.
I will develop software to monitor and compare CSV for you, we will talk about all the details.
a greeting.