~dricottone/fmg-timesheets

ref: cab4c59713aeea180456717c05ba8f8d42e0597b fmg-timesheets/main.py -rw-r--r-- 1.0 KiB
Adding exporters

Wrote and tested the long CSV exporter. Stubbed out the JSON exporter.
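
A minimal sketch of what a long CSV exporter could look like, assuming
"long" means long format (one row per date and code). The `TimeEntry`
record, its field names, and `export_long_csv` are hypothetical and not
taken from the repository.

```python
import csv
import io
from dataclasses import dataclass
from datetime import date


@dataclass
class TimeEntry:
    # Hypothetical record; the field names are assumptions for illustration.
    day: date
    code: str
    hours: float


def export_long_csv(entries, fp):
    """Write one row per (day, code) observation, i.e. long format."""
    writer = csv.writer(fp, lineterminator="\n")
    writer.writerow(["date", "code", "hours"])
    # Sort so the output is stable regardless of input order.
    for entry in sorted(entries, key=lambda e: (e.day, e.code)):
        writer.writerow([entry.day.isoformat(), entry.code, entry.hours])
```

Long format keeps the column set fixed as new codes appear, which tends
to suit loading into a time series store later.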
Fully functional timesheet parser.

The timesheet parser is a complete success. Some minor issues were
ironed out in the XML parser as well.

Next steps: writing to a time series database and beginning analysis.
Goodbye HTML, hello XML

Replaced HTML exporting/parsing with XML exporting/parsing. Also
replaced the 'high-level' function call with 'low-level' pdfminer
usage.

The XML parser now handles validation and suppression of header/footer
content on its own.

From the PDF parser, XML is dumped to a file. From the XML parser, CSV
is dumped to a file. The new timesheet parser should read in that CSV
file.
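
The XML-to-CSV step described above can be sketched roughly as follows.
This assumes the XML resembles pdfminer's XML output (`page`, `textline`
and per-character `text` elements, with a `bbox` attribute of the form
`x0,y0,x1,y1`). The function names and CSV columns are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET


def xml_to_rows(xml_text):
    """Collect one (page, left, top, text) tuple per non-empty text line."""
    root = ET.fromstring(xml_text)
    rows = []
    for page in root.iter("page"):
        for line in page.iter("textline"):
            # Assumption: each <text> element holds a single character.
            text = "".join(t.text or "" for t in line.iter("text")).strip()
            if not text:
                continue
            x0, y0, x1, y1 = (float(v) for v in line.get("bbox").split(","))
            rows.append((page.get("id"), x0, y1, text))
    return rows


def write_rows(rows, fp):
    """Dump the rows to CSV with a header line."""
    writer = csv.writer(fp, lineterminator="\n")
    writer.writerow(["page", "left", "top", "text"])
    writer.writerows(rows)
```

Keeping the left offset in the CSV leaves the later timesheet parser
free to decide which column a value belongs to.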
Significant updates

Wrote a timesheet parser that ingests and validates all semi-structured
data. The next step is to interpret left styles as dates, so that hours
can be parsed into a time entry object.
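
One possible way to interpret left styles as dates, sketched under heavy
assumptions: each day of the period occupies one column, the columns'
left offsets are known, and the leftmost column is the first day. The
function `left_to_date`, its parameters, and the tolerance are all
hypothetical.

```python
from datetime import date, timedelta


def left_to_date(left, column_lefts, period_start, tolerance=2.0):
    """Map a left offset to a date by matching it to a known column.

    Assumes the n-th column (by ascending offset) is the n-th day of
    the period. Returns None if no column is within the tolerance.
    """
    for index, column_left in enumerate(sorted(column_lefts)):
        if abs(left - column_left) <= tolerance:
            return period_start + timedelta(days=index)
    return None
```

A value whose offset matches no column returns `None`, so the caller
can treat it as non-hour data rather than guess a date.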

Updated the HTML parser to filter out unhelpful data more completely,
and to internally build the array of pairs (data and left style).
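
A rough sketch of building (data, left style) pairs from HTML, assuming
each value sits inside a `div` whose inline `style` carries a
`left:NNpx` offset, and that text outside such a `div` is unhelpful. The
class name `PairParser` and the regular expression are hypothetical.

```python
import re
from html.parser import HTMLParser

# Assumption: offsets appear as whole pixels, e.g. "left:72px".
LEFT_RE = re.compile(r"left:\s*(\d+)px")


class PairParser(HTMLParser):
    """Collect (data, left) pairs; skip text with no known left offset."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._left = None

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            # Remember the offset of the most recently opened div, if any.
            match = LEFT_RE.search(dict(attrs).get("style") or "")
            self._left = int(match.group(1)) if match else None

    def handle_data(self, data):
        text = data.strip()
        if text and self._left is not None:
            self.pairs.append((text, self._left))
```

Building the pairs inside the parser means callers never see the raw
markup, only the filtered data.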