Data scraping

10 hours, 9 minutes ago
Full-time
Mid Level
Software Development
Distro

DISTRO is a global platform for companies to interview and hire candidates for full-time roles efficiently.

Internet Software & Services
11-50 employees

Description

  • Research and identify public and government data sources.
  • Extract, transform, and normalize data from websites, APIs, feeds, FTP sources, and online repositories.
  • Design and build reusable, scalable, and maintainable ETL processes and workflows.
  • Apply advanced web scraping techniques using Python, HTTP requests, and HTML parsing.
  • Ensure data quality by identifying inconsistencies and validating data samples.
  • Document methodologies and processes for data acquisition and transformation.
  • Maintain repositories using Git and follow development best practices.
  • Communicate clearly and collaborate with stakeholders.

Requirements

  • Solid experience in web scraping, data scraping, and extracting structured and unstructured data.
  • Practical programming experience in Python or similar languages.
  • Knowledge of APIs, HTTP, FTP, HTML parsing, and relational databases such as PostgreSQL.
  • Advanced English, with fluent written and technical communication.
  • Strong analytical mindset and ability to solve complex data acquisition problems.
  • Ability to optimize solutions and work independently on technical challenges.
  • Strong focus on data validation, normalization, and documentation.
  • Availability to work Monday to Friday from 12:00 PM to 8:00 PM CST.

Benefits

  • Full-time remote work.
  • Monthly salary of $1,200 to $1,300.
  • Opportunity to work on automation and data acquisition challenges in a company-wide role.
