Capturing Training Data using Natural Language Processing

Open
Ubineer
Toronto, Ontario, Canada
CEO
Preferred learners
Canada
Academic experience
Categories
Data analysis, Data modelling, Machine learning, Artificial intelligence, Data science
Skills
data preprocessing, chunking, development environment, natural language processing (NLP), data analysis
Project scope
What is the main goal for this project?

Ubineer is looking to expand our data set by integrating advanced Natural Language Processing (NLP) techniques. We are currently building a massive data set, using NLP to interpret complex queries and capture data. The goal of this project is to expand and improve our data sets so that we can train a large language model (LLM). We are seeking students who want to understand how NLP works and have a passion for data analysis. The project will involve tasks such as data preprocessing, capture, and storage. By the end of the project, students will understand key NLP techniques such as chunking.
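To give a sense of what preprocessing and chunking can look like in practice, here is a minimal sketch. It assumes "chunking" means splitting a document into overlapping word windows, which is common when preparing text for LLM training; in classical NLP the term can also refer to shallow parsing. The function names `preprocess` and `chunk_words` and the default sizes are hypothetical and are not taken from Ubineer's code base.

```python
import re


def preprocess(text: str) -> str:
    """Collapse runs of whitespace into single spaces and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def chunk_words(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into word windows of `size` words.

    Consecutive windows share `overlap` words so that a sentence cut at a
    boundary still appears whole in at least one chunk (an illustrative
    choice, not a requirement of the project).
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and overlap must be in [0, size)")

    words = preprocess(text).split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        # Stop once the current window has reached the end of the text.
        if start + size >= len(words):
            break
    return chunks


if __name__ == "__main__":
    sample = "Revenue  grew in the\nthird quarter while operating costs fell"
    for piece in chunk_words(sample, size=4, overlap=1):
        print(piece)
```

The overlap trades some duplicated text for context at chunk boundaries; setting `overlap=0` gives disjoint chunks.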

What tasks will learners need to complete to achieve the project goal?

The deliverables for this project include a 1-2 hour tutorial and weekly stand-up meetings. If all goes well, some of the code you write will be deployed to our production environment.


At the end of the project, students will be responsible for creating a 2-3 page report describing what they learned and completed.


The report should have:

  1. Basic description of the project
  2. What was completed:
  • Number of companies
  • Number of files parsed
  • Number of data points captured
  • Number of text segments captured
  • Speed per capture
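The report metrics listed above could be tallied while the capture code runs. The sketch below is one possible way to do that; the class name `CaptureStats`, its fields, and the definition of "speed per capture" as average seconds per captured item (data points plus text segments) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CaptureStats:
    """Running totals for the figures requested in the final report."""

    companies: set[str] = field(default_factory=set)
    files_parsed: int = 0
    data_points: int = 0
    text_segments: int = 0
    elapsed_seconds: float = 0.0

    def record_file(self, company: str, data_points: int,
                    text_segments: int, seconds: float) -> None:
        """Add the results of parsing one file for one company."""
        self.companies.add(company)
        self.files_parsed += 1
        self.data_points += data_points
        self.text_segments += text_segments
        self.elapsed_seconds += seconds

    def seconds_per_capture(self) -> float:
        """Average time per captured item (assumed definition of speed)."""
        captures = self.data_points + self.text_segments
        return self.elapsed_seconds / captures if captures else 0.0


if __name__ == "__main__":
    stats = CaptureStats()
    # Hypothetical company names and counts, for illustration only.
    stats.record_file("Acme", data_points=3, text_segments=1, seconds=2.0)
    stats.record_file("Globex", data_points=4, text_segments=1, seconds=2.0)
    print(len(stats.companies), stats.files_parsed,
          stats.data_points, stats.text_segments,
          stats.seconds_per_capture())
```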




Supported causes
Industry, innovation and infrastructure
About the company
2 - 10 employees
Technology, Banking & finance

Ubineer is a leading AI financial technology company focused on delivering productivity to financial decision makers. Our goal is to be the primary source for knowledge management, collaboration and insights in the investment industry. Our mission is to help financial decision makers optimize the quality and speed of their investment lifecycle.