Education


University of Malaya

Master of Data Science degree with distinction (GPA: 3.98/4)

Kuala Lumpur, Malaysia

Feb 2018 - Oct 2019

  • Research project: built and deployed a machine-learning system to predict taxi-trip duration. Tools used include Python, Pandas, Matplotlib, Seaborn, Scikit-learn, XGBoost, Flask (for deployment), and Jupyter Notebook. Project code and report are available at http://bit.ly/ds-prj.
  • Relevant courses: Data Analytics, Data Mining, Machine Learning for Data Science, Programming for Data Science, Big Data Application and Analytics, Big Data Management.

Princess Sumaya University for Technology

B.Sc. degree in Computer Engineering; ranked first (GPA: 93.9%)

Amman, Jordan

Sep 2012 - Sep 2016

  • Relevant courses: Data Structures and Introduction to Algorithms, Database Systems, Visual Programming, Discrete Mathematics, Computer Architecture and Organization, Operating Systems.
UM Certification

Experience


Udacity

Mentor

California, United States

May 2020 - Present

  • Mentor students in the "AI for Healthcare" specialization: I review student projects and answer their questions. The projects cover machine-learning topics including image classification with convolutional neural networks, regression, exploratory data analysis, and model evaluation. They are typically built with tools such as Python, TensorFlow, Keras, Scikit-learn, Pandas, and Matplotlib. Specialization page: https://bit.ly/ai-for-healthcare.

Upwork Inc.

Freelancer

California, United States

Feb 2020 - Present

  • Work on projects in data analysis, natural language processing, and data scraping. I hold a 100% Job Success score on the platform, and all of my ratings are 5/5.
  • For one client, I built a Python program that runs multiple times a day to retrieve specific data from Instagram stories, process and enrich it, and send it to an API endpoint. In another project, I used Python to aggregate messy data from multiple sources into one organized data source. In a third project, I wrote a Python script that crawls data from YouTube using Selenium, processes it, and pushes it to a Google Sheet. For more, please visit my profile: https://bit.ly/upwork-pr.

Apigate Sdn Bhd

Data Science and Analytics Intern

Kuala Lumpur, Malaysia

Apr 2019 - Jul 2019

  • Analyzed millions of records stored in Google BigQuery to extract business insights. Tools used include SQL, Python, Pandas, Google Data Studio, and Matplotlib.
  • Created an ETL Python script to transfer and sync company data from Salesforce to Google BigQuery. The program was scheduled to run weekly and used API services to access Salesforce and BigQuery data.

Skills


Programming Languages: Python, R, JavaScript.

Data Analysis and Machine Learning: Pandas, NumPy, Scikit-learn, TensorFlow/Keras (deep learning, CNNs), XGBoost, LightGBM, Aequitas (bias detection), etc. Also familiar with SAS and Hadoop ecosystem.

Data Visualization and Dashboards: Matplotlib, Seaborn, Google Data Studio, Folium for maps, etc.

Databases: SQL.

Web Scraping: BeautifulSoup, Selenium, HTTrack.

NLP: NLTK.

Web Development: Flask, Django, HTML, CSS, JavaScript, Vue.js.

Cloud Computing: Google Cloud Platform (BigQuery for big data, Compute Engine, and Storage), Amazon Web Services (EC2, S3, and Route 53).

More: Git, graphic design (Photoshop, Canva), video editing (DaVinci Resolve).

Languages: English: fluent (TOEFL iBT: 102). Arabic: native.


Projects


In addition to the projects mentioned above, I've done the following projects:

Analysis of International Football Matches Between 1872 and 2018: Analyzed 40,000+ matches to answer questions about the data using effective visualizations. Used Python, Pandas, Matplotlib, Seaborn, etc. Link: http://bit.ly/ftanalysis.

Kaggle Competitions: Participated in many machine-learning competitions on Kaggle. Some are:

  • Help Navigate Robots: The goal was to detect the type of surface a robot is standing on using sensor data. Ranked in the top 5% among 1,470+ competitors. Used Python, LightGBM for modeling, and the tsfresh package to extract features from the time-series data. Code and results: http://bit.ly/help-robo.
  • VSB Power Line Fault Detection: The goal was to detect whether power-line signals are faulty. Used Python, Pandas, NumPy, and SciPy. Models used include XGBoost, LightGBM, and neural networks. Code and results: http://bit.ly/vsb-pl.

YouTube Trending Videos Analysis: Analyzed data of 40,000+ YouTube trending videos to identify common patterns and get insights (Python, Pandas, Seaborn, etc.). Code and results: http://bit.ly/YT-analysis.

Clustering and Comparing the Neighborhoods of New York City and Toronto: Clustering was based on the similarity between the venues in the neighborhoods. The Foursquare API was used to retrieve venue data. Project page: http://bit.ly/clustnt.

Analysis of Stock Prices in Malaysia: Stock-price data for 1,800+ companies was crawled from multiple sources over 3 months; several analyses were then applied, including sentiment analysis, stock-price correlation, investment recommendation, and clustering. GitHub (with PDF report): http://bit.ly/dm-assign; video: http://bit.ly/dm-vid.

End-to-End Data-Science Project: Built a machine-learning system to predict house prices based on characteristics such as house size and construction year. Project report and code: http://bit.ly/hp-pdf.

Educational YouTube Channel: Recently started a YouTube channel to teach computer science and AI in Arabic. It currently has 27 videos and 8,000+ views. Link: https://bit.ly/youtube-c.

Pair & Compare: A web application that makes it easier to compare fonts and font pairs. All 800+ Google Fonts can be used without downloading or installing any of them. It is built using HTML, CSS, JavaScript, Vue.js, etc. It is available at http://bit.ly/p-and-c.

Focus Phase: An open-source time-tracking command-line application with statistics and visualizations. It is built using Python and published on the Python Package Index. GitHub link: http://bit.ly/focus-phase.

S3upload: An open-source Python application that speeds up uploading a large number of files to AWS S3. GitHub link: http://bit.ly/s3upload.


Certifications


IBM Data Science Professional Certificate (2019)

  • Courses include: Data Analysis with Python, Databases and SQL for Data Science, Data Science Methodology, Machine Learning with Python, Data Visualization with Python, and Open Source Tools for Data Science. More info: http://bit.ly/IBM-ds.