agate’s command structure is very similar to SQL. The primary difference between agate and SQL is that commands like
WHERE explicitly create new tables. You can chain them together as you would with SQL, but be aware each command is actually creating a new table.
All examples in this section use the PostgreSQL dialect for comparison.
If you want to read and write data from SQL, see Load a table from a SQL database.
SELECT state, total FROM table;
new_table = table.select(('state', 'total'))
SELECT * FROM table WHERE LOWER(state) = 'california';
new_table = table.where(lambda row: row['state'].lower() == 'california')
SELECT * FROM table ORDER BY total DESC;
new_table = table.order_by(lambda row: row['total'], reverse=True)
SELECT DISTINCT ON (state) * FROM table;
new_table = table.distinct('state')
Unlike most SQL implementations, agate always returns the full row. Use
Table.select() if you want to filter the columns first.
SQL (two ways):
SELECT * FROM patient, doctor WHERE patient.doctor = doctor.id; SELECT * FROM patient INNER JOIN doctor ON (patient.doctor = doctor.id);
joined = patients.join(doctors, 'doctor', 'id', inner=True)
LEFT OUTER JOIN¶
SELECT * FROM patient LEFT OUTER JOIN doctor ON (patient.doctor = doctor.id);
joined = patients.join(doctors, 'doctor', 'id')
Table.group_by() works slightly different than SQLs. It does not require an aggregate function. Instead it returns
TableSet. To see how to perform the equivalent of a SQL aggregate, see below.
doctors = patients.group_by('doctor')
Chain commands together¶
SELECT state, total FROM table WHERE LOWER(state) = 'california' ORDER BY total DESC;
new_table = table \ .select(('state', 'total')) \ .where(lambda row: row['state'].lower() == 'california') \ .order_by('total', reverse=True)
Chaining commands in this way is often not a good idea. Being explicit about each step tends to produce clearer code.
SELECT mean(age), median(age) FROM patients GROUP BY doctor;
doctors = patients.group_by('doctor') patient_ages = doctors.aggregate([ ('age', agate.Length(), 'patient_count') ('age', agate.Mean(), 'age_mean'), ('age', agate.Median(), 'age_median') ])
The resulting table will have four columns: