Shaarli Daily
March 24, 2025

Parquet should replace the CSV format
December 29, 2022, by Éric Mauvière. Categories: Cartography, Tools, R
Parquet is an open format for storing datasets. Created in 2013 by Cloudera and Twitter and long reserved for big data professionals, it has gained a great deal of popularity in recent months. Far more compact, very fast to read, and understood by a growing number of tools, Parquet has become a credible alternative to the ubiquitous CSV.

Tad is a fast viewer for CSV and Parquet files and for SQLite and DuckDB databases, and it supports large files.
It is a pivot table for analyzing and exploring data. Internally, Tad uses DuckDB for fast, accurate processing.
It is designed to fit into the workflow of data engineers and data scientists.
About rclone
Rclone is a command-line program to manage files on cloud storage. It is a feature-rich alternative to cloud vendors' web storage interfaces. Over 70 cloud storage products support rclone including S3 object stores, business & consumer file storage services, as well as standard transfer protocols.
Rclone has powerful cloud equivalents to the Unix commands rsync, cp, mv, mount, ls, ncdu, tree, rm, and cat. Rclone's familiar syntax includes shell pipeline support and --dry-run protection. It is used at the command line, in scripts, or via its API.
Users call rclone "The Swiss army knife of cloud storage", and "Technology indistinguishable from magic".
Rclone really looks after your data. It preserves timestamps and verifies checksums at all times. Transfers over limited bandwidth, over intermittent connections, or subject to quota can be restarted from the last good file transferred. You can check the integrity of your files. Where possible, rclone employs server-side transfers to minimise local bandwidth use, and it can transfer from one provider to another without using local disk.
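
As a rough illustration of the rsync-like commands, the --dry-run protection, and the integrity check described above, here is a minimal Python sketch that only builds rclone command lines for use in a script. The helper name rclone_command, the local path, and the remote name "remote:" are assumptions made for the example (a remote would first have to be set up with rclone config). The sketch prints the commands rather than running them.

```python
import shlex


def rclone_command(action, source, dest, dry_run=True):
    """Build an argument list for `rclone <action> <source> <dest>`.

    Hypothetical helper for illustration; it does not call rclone itself.
    """
    allowed = {"sync", "copy", "move", "check"}
    if action not in allowed:
        raise ValueError(f"unsupported action: {action}")
    args = ["rclone", action, source, dest]
    # `check` only compares source and destination, so --dry-run is
    # added for the transfer commands only.
    if dry_run and action != "check":
        args.append("--dry-run")
    return args


if __name__ == "__main__":
    # Hypothetical paths: a local folder and a remote named "remote:".
    src, dst = "/home/user/photos", "remote:backup/photos"
    # Preview what a sync would change, then the matching integrity check.
    print(shlex.join(rclone_command("sync", src, dst)))
    print(shlex.join(rclone_command("check", src, dst)))
```

Keeping dry_run=True as the default means a script has to opt in explicitly before anything is actually transferred or deleted. The resulting list can be passed to subprocess.run once rclone is installed and the remote exists.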