On this website: [login to view URL], we can enter company tickers into the "Fast Search" menu to find financial information on a company. For example, we can type "AAPL" to find information about Apple Inc. We can then filter the filing type by "10-K", click "Documents" on the top result to open the most recent filing, and open the link for the first document listed. This opens an .htm file of the 10-K filing. The filing is divided into sections, typically starting with "Item 1", then "Item 1A", "Item 1B", etc. I would like a Python script that outputs the text of each section in a standard data format (e.g. a pandas DataFrame). The script must work for any company. The following companies should be used for basic testing: AAPL, BA, DIS, F, GE, GM, GOOG, GS, JPM, MMM, MU, NVDA, SQ, V, WMT, XOM.
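As a rough illustration of the kind of output described above, here is a minimal sketch that splits the text of an already-downloaded 10-K .htm file into "Item" sections. Downloading the filing is left out, and the names `html_to_text` and `split_items` are hypothetical. It assumes each section heading starts a line with "Item" plus a number, and that when a heading appears more than once (for example in a table of contents) the longest chunk is the real section. Real filings vary a lot, so treat this as a starting point only.

```python
import re
from html import unescape

# Assumption: section headings look like "Item 1.", "ITEM 1A." or "Item 7A:"
# and sit at the start of a line once the HTML tags are stripped.
ITEM_HEADING = re.compile(
    r"^\s*item\s+(\d{1,2}[ab]?)\s*[.:\-]", re.IGNORECASE | re.MULTILINE
)


def html_to_text(html):
    """Crude tag stripper: block-level tags become newlines, other tags are dropped."""
    html = re.sub(r"(?is)<(script|style).*?</\1>", "", html)
    html = re.sub(r"(?i)</?(p|div|br|tr|h[1-6]|li|table)\b[^>]*>", "\n", html)
    text = re.sub(r"<[^>]+>", "", html)
    return unescape(text).replace("\xa0", " ")


def split_items(text):
    """Return one record per section, e.g. {"item": "1A", "text": "..."}.

    If an item label occurs several times, the longest chunk is kept
    (assumption: the real section is longer than its table-of-contents entry).
    """
    matches = list(ITEM_HEADING.finditer(text))
    best = {}
    for i, match in enumerate(matches):
        # A section runs from the end of its heading to the start of the next one.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        label = match.group(1).upper()
        body = text[match.end():end].strip()
        if label not in best or len(body) > len(best[label]):
            best[label] = body
    return [{"item": label, "text": body} for label, body in best.items()]
```

The returned list of dictionaries can be passed directly to `pandas.DataFrame(...)` to get one row per section with `item` and `text` columns.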
Hello sir
I am a qualified Python developer with 8 years of professional experience. I have built two crawlers for EDGAR, and they included a filtering function for 10-K filings as well. I am confident about this project and can help you. I am ready to start work.
Looking forward to hearing from you soon.
Best Regards,
Yongtao
Hello!
Your requirements seem pretty clear. I can create this 10-K parser for .htm files in Python and then test it against your list of companies to confirm that the sections are parsed correctly.
Thanassis
Hi,
As you can see, I'm new to Freelancer, but I have extensive experience in hybrid and native application development.
I have worked with platforms such as React Native, React JS, and Node JS (4+ years of experience in JavaScript).
I have worked on trending features such as geolocation, QR code scanning, and image recognition.
I have read the scope of work, but I need some clarification and more detailed requirements.
It would be great if we could discuss this over private chat.
Thanks
Hello, I have briefly read the sec-parsing description and can deliver as per the requirements. However, we need to discuss the details, deadline, and budget for more clarity.
It's pretty clear what needs to be done, although I have a couple of questions.
1. Am I right to assume that you need only the section names, not the whole text of each section?
2. Is it OK to output JSON? It's a pretty universal format, and pandas can read it without issues.
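To make the second question concrete, here is a minimal sketch of what the JSON output could look like. The `item` and `text` field names and the placeholder section text are assumptions for illustration, not an agreed format.

```python
import json

# Hypothetical records, one per 10-K section; the text is placeholder content.
records = [
    {"item": "1", "text": "Business ..."},
    {"item": "1A", "text": "Risk Factors ..."},
]

# Serialize to a JSON string that could be written to a file.
payload = json.dumps(records, indent=2)

# Reading it back gives the same list of dictionaries.
restored = json.loads(payload)
```

On the pandas side, `pandas.DataFrame(json.loads(payload))` would then produce one row per section with `item` and `text` columns.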