BIG DATA & ANALYTICS

BIG DATA & ANALYTICS & AI
Overview
- 10 years of experience
- 50 projects completed
- 400 data engineers
- Edge, on-premise and cloud-based solutions
Services
- Data warehouse and data mining design and implementation
- Real-time data collection and analysis
- Data visualization
- Standard and custom reports
- Data analytics and forecasting
- Analysis of high-volume structured and unstructured data
- Data migration
Our Capabilities

Technologies
Data Warehousing
- SSIS, SSRS, AWS Glue, AWS SQS, AWS DMS, AWS Kinesis Stream, AWS Kinesis Firehose, AWS EMR
- HBase, HDFS
- MongoDB, CouchDB, Cassandra, Teradata, AWS Redshift, AWS Aurora, AWS Athena, AWS DynamoDB
BI and Data Visualization
- Tableau, Splunk, Pentaho, QuickSight, Power BI, Power Apps, Cognos, Jasper, QlikView, Pentaho BI, Power View, Datazen
Programming
- Python, R, Java, Hive, Scala, SAS, SPSS
Frameworks
- Hadoop, Spark
- Kafka, Storm, Sqoop, Flume
- Mahout, Drill, Solr
- Druid, SnappyData, Cassandra
- MapReduce, Pentaho
Predictive analytics
- Regression
- Classification
- Clustering
- Time Series
Skill Set
AI & Machine Learning - Skill Set
- Machine Learning
  - Descriptive Statistics
  - Deep Learning: TensorFlow, Keras, YOLO
  - Model Optimization: TensorRT, OpenVINO
- Reinforcement Learning
  - Markov Decision Process
  - Q-Learning and Deep Q-Networks
- Supervised Learning: Linear Regression, Neural Networks, Support Vector Machines
- Unsupervised Learning
  - K-Means Clustering
  - Anomaly Detection
  - Principal Component Analysis (PCA)
  - Latent Dirichlet Allocation (LDA)
- MLaaS: SAS, Google Cloud AI, Microsoft Azure AI, AWS AI
- Computer Vision: Image/video analytics
- Object Detection: Products, People, Vehicles
- NLP/OCR
- Category Theory

AI & Machine Learning - Sample Tools
- AI: scikit-learn, TensorFlow, Caffe, MXNet, Keras, PyTorch, Theano, OpenVINO, TensorRT, Gym, OpenCV, Pillow, Rawpy, scikit-image
- NLP: Gensim, Underthesea, NLTK, Hugging Face, spaCy
- OCR: Tesseract, Google Vision, AWS Textract
- ML Pipeline: Amazon SageMaker, Apache Airflow
- Infrastructure: Docker, Kafka, RabbitMQ, Kubernetes
Sample Projects
Customer Data Platform
This solution was built for one of the top clothing retailers in Vietnam.
We built a customer data platform that supports the customer with:
- Data Consolidation: collects and unifies first-party customer data from multiple sources to build a single, coherent, complete view of each customer (a 360-degree customer view)
- Platform Integrations: connects the database, marketing automation platform, and sales automation tool to improve the customer journey
- Execution & Reporting: analyzes and reports on consumer segmentation and marketing strategy across different channels to improve revenue and customer experience
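
The data consolidation step above can be pictured with a minimal Python sketch. It is an illustration only, not the platform's actual implementation: the source names ("pos", "web"), the field names, and the choice of a normalized email address as the matching key are all assumptions made for this example.

```python
def normalize_email(email):
    """Trim whitespace and lowercase so the same address always matches."""
    return email.strip().lower()


def consolidate(sources):
    """Merge per-source customer records into one profile per email.

    `sources` maps a (hypothetical) source name to a list of record dicts.
    The first non-empty value seen for a field is kept; later sources
    only fill in fields that are still missing.
    """
    profiles = {}
    for source_name, records in sources.items():
        for record in records:
            key = normalize_email(record["email"])
            profile = profiles.setdefault(key, {"email": key, "sources": []})
            if source_name not in profile["sources"]:
                profile["sources"].append(source_name)
            for field, value in record.items():
                if field == "email" or value in (None, ""):
                    continue
                profile.setdefault(field, value)
    return profiles


# Hypothetical sample data from two systems describing the same customer.
sources = {
    "pos": [{"email": "An@Example.com ", "name": "An Nguyen", "phone": ""}],
    "web": [{"email": "an@example.com", "phone": "0900000000", "city": "Hanoi"}],
}
unified = consolidate(sources)
print(unified["an@example.com"])
```

In practice, identity matching usually relies on more than one key (phone number, loyalty ID, and so on); a single email key simply keeps the sketch short.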

Healthcare & Insurance Data Platform
This solution was built for an American health insurance company, the third-largest health insurance provider in the nation.
We built an advanced data platform that supports the customer with:
- Data Set: collects information from disparate sources and creates one unified data set that serves as a single source of truth for all parties involved
- Data Security: uses the Azure platform to store data securely
- Data Visualization: visualizes data from various sources and shares the results across all parties

Phishing Simulation Training Platform
This solution was built for an Australia-based company that provides security awareness and phishing simulation training solutions.
We developed a solution that supports the customer with:
- Tracking behavior: reduces the likelihood of data spills or phishing fallout affecting the organization by tracking and detecting risky user behavior in response to phishing emails
- Recommendation: recommends suitable courses that build security skills and awareness, helping users make safer decisions and fortifying cyber security from the inside out

Data Warehousing Recruitment System
This solution was built for a leading recruitment agency in Australia that helps organizations recruit the ideal staff and helps people find rewarding jobs.
We helped the client manage data more efficiently and gain a competitive advantage through:
- Data Centralization: gathers data stored across various sites and moves it to a unified environment for immediate access and analysis
- Data Ingestion: creates a data pipeline on AWS that ingests data from different sources, transforms various types of data into predefined formats, and delivers it to a data warehouse
- Automation: uses data ingestion technology to automate tasks that previously had to be carried out manually
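
As a rough illustration of the ingest, transform, and deliver flow described above, here is a small, self-contained Python sketch. It is not the AWS pipeline itself: an in-memory SQLite database stands in for the data warehouse, and the feed formats, field names, and table name are hypothetical.

```python
import csv
import io
import json
import sqlite3


def from_csv(text, source):
    """Map rows of a (hypothetical) CSV feed onto the target format."""
    for row in csv.DictReader(io.StringIO(text)):
        yield {
            "candidate_id": int(row["id"]),
            "full_name": row["name"].strip(),
            "source": source,
        }


def from_json(text, source):
    """Map items of a (hypothetical) JSON feed onto the target format."""
    for item in json.loads(text):
        yield {
            "candidate_id": int(item["candidateId"]),
            "full_name": item["fullName"].strip(),
            "source": source,
        }


def load(conn, rows):
    """Deliver normalized rows to the warehouse table (SQLite stand-in)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS candidates "
        "(candidate_id INTEGER, full_name TEXT, source TEXT)"
    )
    conn.executemany(
        "INSERT INTO candidates VALUES (:candidate_id, :full_name, :source)",
        list(rows),
    )
    conn.commit()


# Two feeds in different formats, as might arrive from different sites.
csv_feed = "id,name\n1, Alice Tran\n"
json_feed = '[{"candidateId": 2, "fullName": "Bob Lee"}]'

conn = sqlite3.connect(":memory:")
load(conn, from_csv(csv_feed, "site_a"))
load(conn, from_json(json_feed, "site_b"))
rows = conn.execute(
    "SELECT candidate_id, full_name, source FROM candidates ORDER BY candidate_id"
).fetchall()
print(rows)
```

Each source gets its own small adapter that emits the same predefined record shape, so adding a new source does not require changes to the loading step.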

Data Warehouse For Product Lifecycle Management
This solution was built for an American enterprise in database management and security.
We developed a solution that supports the customer with:
- Data integration: integrates data from various systems
- Integrated solution: loads available data feeds from operator and ERP systems, such as billing, sales, and inventory
- Continuous tracking system: captures device shipments, shipment attributes, device sales, sales incentives, devices associated with activated services, and usage volume
- QlikView Server integration: provides executive reports such as sales volume, in-store product availability, and forecasting accuracy

Migration Data Warehouse From On-Premise To AWS Cloud
This solution was built for an Australian online education and training company whose knowledge base is collected from many online teachers and uploaded to the cloud.
We developed a solution that supports the customer with:
- Migrating data from on-premise to AWS Cloud
- Designing a data warehouse on the cloud using AWS services
- Designing new metadata for the cloud data warehouse
- Implementing CI/CD using Bitbucket Pipelines and AWS CloudFormation
- Integrating Power BI with the cloud data warehouse
- Integrating Power Apps with the cloud data warehouse

Contact our BI, big data, and analytics team to discuss a solution for your needs.