- A minimum of 2 years of relevant industry experience is required.
- This role requires entity extraction from PDFs.
- Must be well versed in the architecture and layers of PDFs and able to decompose these layers.
- Must be able to identify font information, color schemes, handwritten text, images, tables, tables of contents, headers and footers, and so on in PDFs using Python.
- Must have hands-on experience with Python libraries such as pdfminer, pytesseract, pypdf, pdfbox, etc.
- At least 2 years of industry experience in Python programming.
- BE / B.Tech / MCA / MSc / BSc
- Proven experience building applications using Python, OCR, etc.
- Contribute to various development projects.
- Excellent English communication skills, both written and verbal
- Ability to self-learn
- Confident and friendly
- Critical thinking, logical analysis, and the ability to work independently, prioritize, and take initiative