
Analyzing the climate of an area using functional data

July 11, 2023

Climate change is the shift in weather patterns, driven in large part by human activity, that receives so much coverage in the media. There is no doubt, except among those who ignore the evidence, that it is happening and that it will affect the entire planet for years to come. In this article I propose a model to analyze the change between the same month of two different years, June 2021 and June 2023, using functional data. It is not proof of climate change: correlation does not imply causation. Still, it is interesting to see that slight changes are noticeable over a span of two years, which is negligible on a geological scale.

In this article, we will work with functional data and plot how the temperature evolved over each day of June 2023, compared with the values for June 2021. Python and ChatGPT are used.

What is functional data?

Functional data analysis is a branch of statistics that analyzes data providing information about curves, surfaces, or anything else that varies over a continuum. Each element in the sample is considered a function. The physical continuum over which these functions are defined is often time, but it can also be spatial location, wavelength, probability, and so on.

A random variable χ is called a functional variable if it takes values in a complete metric space or in a semimetric functional space. An observation x of χ is called a functional datum.
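To make the idea of a functional datum concrete, here is a minimal sketch that turns a handful of discrete readings into a function x(t) that can be evaluated at any time in the observed interval. It uses plain linear interpolation, which is only one of many possible choices; the name `make_curve` and the sample readings are hypothetical, and the observation times are assumed to be distinct.

```python
from bisect import bisect_right

def make_curve(times, values):
    """Return a function x(t) that linearly interpolates the observations."""
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two (time, value) pairs")
    pairs = sorted(zip(times, values))
    ts = [t for t, _ in pairs]
    vs = [v for _, v in pairs]

    def x(t):
        if not ts[0] <= t <= ts[-1]:
            raise ValueError("t is outside the observed interval")
        # Index of the right-hand neighbour, clamped so that t == ts[-1] works
        i = min(bisect_right(ts, t), len(ts) - 1)
        t0, t1 = ts[i - 1], ts[i]
        v0, v1 = vs[i - 1], vs[i]
        return v0 + (v1 - v0) * (t - t0) / (t1 - t0)

    return x

# Made-up readings: minutes since midnight and temperature in ºC
curve = make_curve([0, 10, 20, 30], [12.0, 12.4, 12.2, 13.0])
print(curve(15))  # value halfway between the 10- and 20-minute readings
```

The object of study is then `curve` itself rather than the four numbers it was built from.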

Steps to start working with functional data

The main steps in functional data analysis are:

  • Collect, clean and sort raw data.
  • Convert that data into its functional form.
  • Explore the data using plots and summary statistics.
  • Perform exploratory analyses, such as principal component analysis.
  • Register the data, if necessary, so that important features occur at the same argument values.
  • Build a model.
  • Finally, evaluate the performance of the model.
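As an illustration of the second step, converting raw readings into a smoother functional form, the sketch below applies a centered moving average. This is a deliberately simple stand-in for the basis-expansion smoothers normally used in functional data analysis, not the method used later in this article; the function name and the sample values are hypothetical.

```python
def moving_average(values, window=3):
    """Smooth a sequence with a centered moving average (window must be odd)."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        # Shrink the window at the edges instead of padding the data
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed

raw = [10.0, 14.0, 12.0, 16.0, 14.0]  # made-up readings in ºC
smooth = moving_average(raw, window=3)
print(smooth)
```

A wider window gives a smoother curve at the cost of flattening short-lived peaks.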

Weather station

To carry out this work I used the data from the Castro Vicaludo weather station, located in Oia, in the province of Pontevedra, Spain. The Castro Vicaludo station (Fig. 1) is located at 42º latitude and -8.86º longitude, at an altitude of 473 meters, and has been collecting data since April 28, 2004.

Figure 1. Meteorological station of Castro Vicaludo, Oia, Pontevedra, Spain.

Two measurement periods are used, both with ten-minute variables, that is, data collected every 10 minutes:

  • Ten-minute variables. Consultation period: 01-06-2023 to 30-06-2023
  • Ten-minute variables. Consultation period: 01-06-2021 to 30-06-2021
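With one reading every 10 minutes, a complete day has 24 × 6 = 144 readings, so a complete 30-day month has 4320. The following sketch checks a list of timestamps for days that do not have exactly 144 readings. The function name and the sample timestamps are hypothetical; it assumes the timestamps have already been parsed into `datetime` objects.

```python
from collections import Counter
from datetime import datetime, timedelta

READINGS_PER_DAY = 24 * 60 // 10  # 144 ten-minute slots in a day

def incomplete_days(timestamps):
    """Return {date: count} for days that do not have exactly 144 readings."""
    counts = Counter(ts.date() for ts in timestamps)
    return {day: n for day, n in counts.items() if n != READINGS_PER_DAY}

# Made-up example: one complete day followed by a day with only two readings
start = datetime(2023, 6, 1)
full_day = [start + timedelta(minutes=10 * i) for i in range(READINGS_PER_DAY)]
partial_day = [datetime(2023, 6, 2, 0, 0), datetime(2023, 6, 2, 0, 10)]

result = incomplete_days(full_day + partial_day)
for day, n in result.items():
    print(day, n)
```

Days flagged this way may need to be dropped or interpolated before their curves are compared with the others.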

Functional data of temperature at 0.1 meters per day

Download the data from https://www.meteogalicia.gal and work with two JSON files:

  • oia062021.json, which has the temperature data at 0.1 meters measured for every day in the month of June 2021 at 10-minute intervals.
  • oia062023.json, which has the temperature data at 0.1 meters measured for every day in the month of June 2023 at 10-minute intervals.

Strictly speaking, JSON files have no "header" the way CSV or plain text files do. However, to see the structure or top-level keys of a JSON file, you can follow these steps:

  1. Read the JSON file: open the file and read its contents into a variable, using the functions provided by your programming language.
  2. Parse the JSON: convert the text into a native object of your programming language (a dictionary in Python), which lets you access its keys and values.
  3. View the JSON keys: display the top-level keys to get an idea of the structure of the file.

You can use this Python code:

import json

with open('oia062023.json') as file:
    data = json.load(file)

# Get the top-level keys of the JSON
keys = data.keys()

# Print the keys
for key in keys:
    print(key)

From the json file we extract the information regarding average temperature at 0.1 meters:

import json
import matplotlib.pyplot as plt
from datetime import datetime

with open('oia062023.json') as file:
    data = json.load(file)

fechas = []
temperaturas = []

for resultado in data['resultados']:
    fecha = datetime.strptime(resultado['Data'], "%Y-%m-%d %H:%M:%S.%f")
    temperatura = resultado['Valor']
    fechas.append(fecha)
    temperaturas.append(temperatura)

A graph is created where the X axis is the time of day, sampled every ten minutes, and the Y axis is the average temperature. This code is used:

import json
import matplotlib.pyplot as plt
from datetime import datetime

with open('oia062023.json') as file:
    data = json.load(file)

datos_por_dia = {}

for resultado in data['resultados']:
    fecha = datetime.strptime(resultado['Data'], "%Y-%m-%d %H:%M:%S.%f").date()
    temperatura = resultado['Valor']
    if fecha not in datos_por_dia:
        datos_por_dia[fecha] = {'temperaturas': []}
    datos_por_dia[fecha]['temperaturas'].append(temperatura)

# Plot one curve per day
for fecha, datos in datos_por_dia.items():
    plt.plot(datos['temperaturas'], label=fecha.strftime("%d"))

# Set the axis labels and the title of the graph
plt.xlabel('Time interval')
plt.ylabel('Temperature (ºC)')
plt.title('Evolution of the temperature at 0.1 m per day')

# Place the legend outside the plot area, to the right
plt.legend(title='Day of the month', loc='center left', bbox_to_anchor=(1, 0.5))

# Show the graph
plt.show()

This code produces the following result for 2023. Each day of the month has its own curve and color. The X axis is the position of each reading within the day: since the values are taken every 10 minutes, a complete day has 144 data points (24 hours × 6 readings per hour).

[Figure: temperature curves at 0.1 m for each day of June 2023]
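Once each day is a curve, a natural first summary is the mean curve: the average across days at each ten-minute slot. The sketch below is one possible way to compute it, not part of the original analysis. The function name is hypothetical, and days without exactly `expected_len` readings are simply skipped, which is an assumption about how to handle gaps.

```python
from statistics import mean

def mean_curve(curves, expected_len=144):
    """Pointwise mean of the curves that have exactly expected_len readings."""
    complete = [c for c in curves if len(c) == expected_len]
    if not complete:
        raise ValueError("no complete curves")
    # zip(*complete) walks the curves in parallel, one time slot at a time
    return [mean(slot) for slot in zip(*complete)]

# Tiny made-up example with 3 readings per day instead of 144;
# the last day is incomplete and is therefore ignored
curves = [[10.0, 12.0, 11.0], [12.0, 14.0, 13.0], [9.0, 9.5]]
average = mean_curve(curves, expected_len=3)
print(average)
```

With the `datos_por_dia` dictionary built in the code above, the call would be `mean_curve([d['temperaturas'] for d in datos_por_dia.values()])`.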

Comparing the data with the year 2021

Now the data for 2023 will be compared with those for 2021. The 2021 data are downloaded from meteogalicia.gal in the same way, and running the same code on oia062021.json produces this graph for 2021:

[Figure: temperature curves at 0.1 m for each day of June 2021]
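Because functional data live in a metric space, two curves can also be compared with a single number: their distance. The sketch below approximates the L2 distance between two curves sampled on the same grid, using the trapezoidal rule on the squared difference. It is offered as one possible comparison, for example between the mean curves of the two years, and is not something the original analysis computes; the function name is hypothetical.

```python
import math

def l2_distance(f, g, step=10.0):
    """Approximate the L2 distance between two curves on a shared grid.

    step is the grid spacing (10 minutes for ten-minute data).
    """
    if len(f) != len(g) or len(f) < 2:
        raise ValueError("curves must share a grid of at least two points")
    sq = [(a - b) ** 2 for a, b in zip(f, g)]
    # Trapezoidal rule: the two end points count with half weight
    integral = step * (sum(sq) - 0.5 * (sq[0] + sq[-1]))
    return math.sqrt(integral)

# Made-up curves on a grid with spacing 1
distance = l2_distance([1.0, 2.0, 3.0], [0.0, 0.0, 0.0], step=1.0)
print(distance)
```

A distance of zero means the two curves coincide at every grid point; larger values mean the curves differ more over the day.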

Statistical variables to compare both years

To compare both years we will use the following Python code, which computes some summary statistics:

import json
from datetime import datetime
import statistics

# Read the data from the first JSON file (oia062023.json)
with open('oia062023.json') as file:
    data_2023 = json.load(file)

# Read the data from the second JSON file (oia062021.json)
with open('oia062021.json') as file:
    data_2021 = json.load(file)

# Extract the temperatures from each JSON file
temperaturas_2023 = [resultado['Valor'] for resultado in data_2023['resultados']]
temperaturas_2021 = [resultado['Valor'] for resultado in data_2021['resultados']]

# Compute summary statistics for the 2023 data
temperatura_promedio_2023 = statistics.mean(temperaturas_2023)
temperatura_maxima_2023 = max(temperaturas_2023)
temperatura_minima_2023 = min(temperaturas_2023)
temperatura_mediana_2023 = statistics.median(temperaturas_2023)
desviacion_estandar_2023 = statistics.stdev(temperaturas_2023)

# Compute summary statistics for the 2021 data
temperatura_promedio_2021 = statistics.mean(temperaturas_2021)
temperatura_maxima_2021 = max(temperaturas_2021)
temperatura_minima_2021 = min(temperaturas_2021)
temperatura_mediana_2021 = statistics.median(temperaturas_2021)
desviacion_estandar_2021 = statistics.stdev(temperaturas_2021)

# Print the results
print("Results for 2023:")
print(f"T. mean: {temperatura_promedio_2023} °C")
print(f"T. max: {temperatura_maxima_2023} °C")
print(f"T. min: {temperatura_minima_2023} °C")
print(f"T. median: {temperatura_mediana_2023} °C")
print(f"Standard deviation: {desviacion_estandar_2023} °C")
print()
print("Results for 2021:")
print(f"T. mean: {temperatura_promedio_2021} °C")
print(f"T. max: {temperatura_maxima_2021} °C")
print(f"T. min: {temperatura_minima_2021} °C")
print(f"T. median: {temperatura_mediana_2021} °C")
print(f"Standard deviation: {desviacion_estandar_2021} °C")

The result, organized as a table:

| Year | T. mean (°C) | T. max (°C) | T. min (°C) | T. median (°C) | Standard deviation (°C) |
|-----:|-------------:|------------:|------------:|---------------:|------------------------:|
| 2023 |      19.1679 |       33.95 |       10.64 |         17.665 |                 4.98359 |
| 2021 |      16.9068 |       39.65 |        5.52 |         14.97  |                 6.2705  |
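The same five statistics are computed twice in the code above, once per year. A small helper removes the duplication; the sketch below is one way to write it. The name `summarize` and the sample values are hypothetical, and `statistics.stdev` is the sample standard deviation, as in the original code.

```python
import statistics

def summarize(values):
    """Return the five summary statistics used in the table above."""
    return {
        "mean": statistics.mean(values),
        "max": max(values),
        "min": min(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }

sample = [10.0, 12.0, 14.0, 16.0]  # made-up temperatures in ºC
summary = summarize(sample)
for name, value in summary.items():
    print(f"{name}: {value:.2f} ºC")
```

With the article's lists, `summarize(temperaturas_2023)` and `summarize(temperaturas_2021)` would give one row of the table each.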

Summary

Temperature data from June 2023 have been compared with those from June 2021. According to the data obtained, the average temperature in June 2023 was 19.17 ºC, while in June 2021 it was 16.91 ºC. The maximum temperature was 33.95 ºC in 2023 and 39.65 ºC in 2021. The minimum temperature was 10.64 ºC in 2023 and 5.52 ºC in 2021. The median (the middle value of the sorted data) was 17.67 ºC in 2023 versus 14.97 ºC in 2021. The standard deviation was 4.98 ºC in 2023 versus 6.27 ºC in 2021, so the 2021 readings were more spread out.
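The year-to-year differences can be computed directly from the table. The short sketch below copies the table values by hand, so the dictionary names are illustrative only, and subtracts 2021 from 2023 for each statistic.

```python
# Summary values copied from the article's table (ºC)
stats_2023 = {"mean": 19.1679, "max": 33.95, "min": 10.64, "median": 17.665, "stdev": 4.98359}
stats_2021 = {"mean": 16.9068, "max": 39.65, "min": 5.52, "median": 14.97, "stdev": 6.2705}

# Difference 2023 minus 2021 for each statistic, rounded to two decimals
differences = {
    name: round(stats_2023[name] - stats_2021[name], 2) for name in stats_2023
}

for name, diff in differences.items():
    print(f"{name}: {diff:+.2f} ºC")
```

The mean is about 2.26 ºC higher in 2023, while the maximum is 5.70 ºC lower and the minimum 5.12 ºC higher, which matches the smaller spread of the 2023 readings.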

Finally, the following image compares the two years, with 2021 on the left and 2023 on the right:


Avelino Dominguez

👨🏻‍🔬 Biologist 👨🏻‍🎓 Teacher 👨🏻‍💻 Technologist 📊 Statistician 🕸 #SEO #SocialNetwork #Web #Data ♟Chess 🐙 Galician
