We are hiring a Data Science Practitioner for a 100% onsite role in Woodlawn, MD. The selected candidate must have a clearable background and be eligible to work on a W2 basis. No third-party candidates accepted.
Role Description:
• Master's degree plus 2 years of experience, or 6 years of relevant work experience
• Excellent oral and written communication skills
• Formulate and rapidly prototype various approaches as well as effectively communicate the pros and cons of each.
• Excellent time management
• Ability to contribute to a high-performing, motivated workgroup by applying interpersonal and collaboration skills to achieve project goals
• Provide technical guidance in the fields of NLP, Machine Learning, Statistical Methods
• Provide data-driven approaches to tackle various business and NLP problems
• Ability to contribute to the creation of an environment that motivates individuals to work collaboratively as a team
Requires proficiency in:
• Python (required)
• Regular Expressions
• SQL (PostgreSQL)
• NoSQL (MongoDB)
• Version control systems (Git)
• Experience with ML frameworks and methods: TensorFlow, PyTorch, Transformers, Scikit-learn, XGBoost, LSTM, Keras, Pandas, BERT, CNN, RNN, SVMs, k-Nearest Neighbors, Linear/Logistic Regression and Classification, Ensemble Methods, Graphical Models, Clustering, Tesseract
• Information Extraction
• Statistical model building (particularly classification)
• Ability to draw insights from sparsely labeled textual data.
• Ability to leverage domain knowledge as well as ontologies to improve model performance
Strong understanding of statistical modeling and experimental design including:
• when to value precision vs recall
• bias/variance tradeoffs
• handling sampling issues
• ability to improve performance on noisy data (both textual and numeric)
Knowledge of and experience using various NLP approaches, particularly:
• Pattern recognition/feature extraction
• Supervised, Unsupervised, and Semi-Supervised learning techniques
• Understanding of various language models (N-Gram, Skipgram, NLM, etc.)
• Practical experience leveraging open-source libraries for emerging DNN approaches to NLP (transformers, BERT, RoBERTa, etc.)
• Chunking/Tokenization
• Semantic parsing
The following skills are not required but are highly desired:
• Experience with NLP technologies
• Experience with machine learning
• Web service technologies such as SOAP, WSDL, WS-Security, MTOM, SwA
• Relational databases and related technologies such as DB2, Oracle, MySQL, SQL, JDBC
• NoSQL databases such as MongoDB and HBase
• Hadoop, Spark, HDFS, MapReduce, YARN, Scala, PySpark
• XML processing experience such as XSD, XPath, XSL, XSLT, etc.
• ebXML
• IBM MQ Series