Open Innovation Campus

Location

Madrid

Weekly hours

25

Expected Start Date

April 2025

Job Profile

Technology

Speciality

Data Engineering

Language

Spanish / English

DATA MANAGEMENT AND VALIDATION

Description of the team offering the internship

In the Applied AI & Privacy Area we support the digital transformation of Telefónica's operations (OBs) through the design and development of data-driven products (Big Data, ML/AI) and their software architecture. From the prototyping phase to industrialization and production, we work with different business areas of Telefónica, guaranteeing privacy from the design stage.

What will you learn during your internship with us?

The intern will contribute to the management and validation of the data that feeds the company's products, ensuring the quality and consistency of the information used in various operations. Their work will be key in the extraction, cleaning, processing and validation of data coming from different sources and countries, ensuring that it meets the quality standards required for use in analytical and artificial intelligence models.

MAIN ACTIVITIES AND RESPONSIBILITIES:

- Support the extraction and processing of data from various sources for further analysis.
- Validate the quality of input data to products, identifying possible inconsistencies or errors.
- Collaborate in data cleansing, transformation and structuring in Big Data environments.
- Participate in the documentation of data validation processes and in the generation of quality reports.
- Support the automation of data validation and monitoring processes through Python and PySpark scripts.
- Collaborate with multidisciplinary teams to ensure the proper integration of data into analytical products and machine learning models.
- Participate in meetings with teams from different countries to understand the requirements and particularities of the data used.
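The kind of data-validation automation described above can be sketched as follows. This is a minimal illustration in plain Python (the team's actual pipelines, per the description, run on PySpark in Big Data environments); all field names, country codes and quality rules here are hypothetical examples, not the team's real schema:

```python
# Minimal data-validation sketch: check records coming from different
# sources/countries against simple quality rules before they feed
# analytical models. Field names and rules are hypothetical.

def validate_record(record, required_fields, valid_countries):
    """Return a list of quality issues found in a single record."""
    issues = []
    for field in required_fields:
        if record.get(field) in (None, ""):
            issues.append(f"missing field: {field}")
    country = record.get("country")
    if country is not None and country not in valid_countries:
        issues.append(f"unknown country: {country}")
    return issues

def validation_report(records, required_fields, valid_countries):
    """Summarize how many records pass and which issues occur."""
    report = {"total": len(records), "valid": 0, "issues": []}
    for i, record in enumerate(records):
        issues = validate_record(record, required_fields, valid_countries)
        if issues:
            report["issues"].append((i, issues))
        else:
            report["valid"] += 1
    return report

# Hypothetical input from two different sources.
records = [
    {"id": 1, "country": "ES", "value": 10},
    {"id": 2, "country": "XX", "value": 5},   # unknown country code
    {"id": 3, "country": "BR"},               # missing "value" field
]
report = validation_report(records, ["id", "value"], {"ES", "BR", "DE"})
```

In a production setting the same per-record rules would typically be expressed as PySpark DataFrame filters so they scale to the full data volume, with the resulting counts feeding the quality reports mentioned above.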

Training and skills you will need to develop this internship

- Student in the final semesters, or recent graduate, of a STEM degree (engineering, mathematics, statistics, physics, etc.).
- Knowledge of data manipulation and analysis with Python.
- Basic knowledge of Big Data environments and tools such as PySpark.
- Familiarity with SQL queries for data extraction and validation.

It will be an asset:

- Knowledge of data visualization tools (Seaborn, Matplotlib, Power BI, Tableau, etc.).
- Previous experience in data management in cloud environments (AWS, GCP, Azure).
- Knowledge of agile methodologies and version control tools (Git, GitHub).