Introduction to SPARQL
SPARQL (SPARQL Protocol and RDF Query Language) is the standard query language for RDF data. It allows you to:
- Extract specific information from RDF graphs
- Perform complex pattern matching across interconnected data
- Filter and transform results based on specific criteria
- Aggregate data for analysis and reporting
In this module, you'll learn the basics of SPARQL syntax, how to build queries, and practice extracting information from RDF data.
SPARQL Basics
What is SPARQL?
SPARQL (pronounced "sparkle") is to RDF what SQL is to relational databases. It's a specialized query language designed to retrieve and manipulate data stored in RDF format.
Because RDF data is structured as a graph of triples (subject-predicate-object), SPARQL is built around pattern matching across these triple structures.
Key Insight
SPARQL queries work by matching patterns in the RDF graph. Think of it as searching for specific shapes or connections within the graph structure.
Understanding Graph Patterns
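A basic SPARQL query consists of optional PREFIX declarations, a query form such as SELECT, and a WHERE clause that contains the graph pattern to match: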
PREFIX ex: <http://example.org/>
SELECT ?subject ?object
WHERE {
?subject ex:predicate ?object .
}
This query finds all pairs of subjects and objects that are connected by the predicate "ex:predicate".
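To make the idea of pattern matching concrete, here is a minimal Python sketch of how a single triple pattern could be evaluated against a small in-memory graph. It is a toy illustration, not how a real SPARQL engine is implemented; the `TRIPLES` data and the `match_pattern` helper are hypothetical names introduced only for this example.

```python
# Toy in-memory graph: each triple is (subject, predicate, object).
# All data below is made up for illustration.
TRIPLES = [
    ("ex:alice", "ex:predicate", "ex:bob"),
    ("ex:bob", "ex:predicate", "ex:carol"),
    ("ex:alice", "ex:other", "ex:dave"),
]


def match_pattern(pattern, triples):
    """Return one binding dict per triple that matches the pattern.

    Terms starting with '?' are variables; anything else must match exactly.
    """
    results = []
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must bind to the same value each time.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            # No break occurred: every term matched, so keep the binding.
            results.append(binding)
    return results


# Equivalent of: SELECT ?subject ?object WHERE { ?subject ex:predicate ?object . }
rows = match_pattern(("?subject", "ex:predicate", "?object"), TRIPLES)
for row in rows:
    print(row["?subject"], row["?object"])
```

Each result row is a set of variable bindings, which is the shape of a SELECT result: one column per variable, one row per match.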
SPARQL Query Structure
Anatomy of a SPARQL Query
A complete SPARQL query typically includes these components, in the order they appear in the query text:
- PREFIX declarations: Define shorthand prefixes for URIs
- Query form: SELECT, ASK, CONSTRUCT, or DESCRIBE
- Dataset definition: Specifies which RDF graphs to query (FROM, FROM NAMED)
- WHERE clause: Contains the graph pattern to match
- Result modifiers: ORDER BY, LIMIT, OFFSET, etc.
SELECT Queries
Building SELECT Queries
SELECT queries are the most common SPARQL query type. They return a tabular result of variables and their bindings, similar to SQL's SELECT statement.
A basic SELECT query looks like this:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?email
WHERE {
?person foaf:name ?name .
?person foaf:mbox ?email .
}
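This query finds the name and email address of every resource that has both a foaf:name and a foaf:mbox. A person who is missing either property does not appear in the results.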
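The two triple patterns share the ?person variable, so the query behaves like a join on that variable. The following Python sketch mimics that behavior over a toy list of triples; the data and the `select_name_email` function are assumptions made up for illustration, not part of any SPARQL library.

```python
# Toy data: (subject, predicate, object) triples, invented for this example.
TRIPLES = [
    ("ex:p1", "foaf:name", "Alice"),
    ("ex:p1", "foaf:mbox", "mailto:alice@example.org"),
    ("ex:p2", "foaf:name", "Bob"),  # no foaf:mbox triple for ex:p2
]


def select_name_email(triples):
    """Join the foaf:name and foaf:mbox patterns on the shared ?person."""
    rows = []
    for person, pred, name in triples:
        if pred != "foaf:name":
            continue
        for other, pred2, email in triples:
            # Both patterns must match the same ?person to produce a row.
            if pred2 == "foaf:mbox" and other == person:
                rows.append((name, email))
    return rows


result = select_name_email(TRIPLES)
print(result)
```

In this sketch "Bob" produces no row because the second pattern has nothing to match for ex:p2, which is the same reason the SPARQL query would skip that person.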
Complete the Query
Fill in the blanks to create a valid SPARQL query that finds all books and their authors:
PREFIX dc: <http://purl.org/dc/elements/1.1/>
____ ?book ?author
WHERE {
?book a <http://example.org/Book> .
?book ____ ?author .
}
Query Type:
Predicate:
Filters & Operators
Refining Results with Filters
SPARQL filters allow you to restrict results based on specific conditions. Common operators include:
- =, !=, <, >, <=, >=: Comparison operators
- &&, ||, !: Logical operators (AND, OR, NOT)
- REGEX(): Regular expression matching
- BOUND(): Checks if a variable is bound
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?book ?title
WHERE {
?book dc:title ?title .
FILTER(REGEX(?title, "SPARQL", "i"))
}
This query finds books with "SPARQL" in their title (case-insensitive).
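As a rough analogy, the filter step can be pictured as a regular-expression test applied to each candidate binding. The sketch below uses Python's `re` module on a hypothetical dictionary of titles; note that SPARQL's REGEX follows XPath/XQuery regular expression syntax, which differs from Python's in some details, so this only approximates the behavior for simple patterns.

```python
import re

# Hypothetical ?book -> ?title bindings, invented for this example.
TITLES = {
    "ex:book1": "Learning SPARQL",
    "ex:book2": "Sparql by Example",
    "ex:book3": "Relational Databases",
}


def filter_titles(titles, pattern, flags=""):
    """Keep the (book, title) pairs whose title matches the pattern.

    Mirrors FILTER(REGEX(?title, pattern, flags)); only the "i" flag is handled.
    """
    re_flags = re.IGNORECASE if "i" in flags else 0
    regex = re.compile(pattern, re_flags)
    # search() matches anywhere in the string, like an unanchored REGEX.
    return [(book, title) for book, title in titles.items() if regex.search(title)]


matches = filter_titles(TITLES, "SPARQL", "i")
print(matches)
```

Dropping the "i" flag in this sketch would keep only "Learning SPARQL", since the match would then be case-sensitive.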
Choose the Right Filter
Select the correct SPARQL filter expression for each scenario:
1. Find people older than 30:
2. Find books published after 2020 with "Data" in the title:
OPTIONAL Patterns
Handling Missing Data
The OPTIONAL keyword in SPARQL is similar to a LEFT JOIN in SQL. It allows patterns to match even when some parts of the pattern don't have matches in the data.
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?email
WHERE {
?person foaf:name ?name .
OPTIONAL { ?person foaf:mbox ?email }
}
This query returns all names, and email addresses when available. People without email addresses will still appear in results with the ?email variable unbound.
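The LEFT JOIN analogy can be sketched in a few lines of Python. This is a toy model over made-up triples, with `None` standing in for an unbound variable; the `select_name_optional_email` function is a hypothetical helper, not a real API.

```python
# Toy data: (subject, predicate, object) triples, invented for this example.
TRIPLES = [
    ("ex:p1", "foaf:name", "Alice"),
    ("ex:p1", "foaf:mbox", "mailto:alice@example.org"),
    ("ex:p2", "foaf:name", "Bob"),  # no foaf:mbox triple for ex:p2
]


def select_name_optional_email(triples):
    """Required foaf:name pattern, optional foaf:mbox pattern."""
    rows = []
    for person, pred, name in triples:
        if pred != "foaf:name":
            continue
        emails = [o for s, p, o in triples if s == person and p == "foaf:mbox"]
        if emails:
            # The optional part matched: one row per mailbox.
            rows.extend((name, email) for email in emails)
        else:
            # The optional part did not match: keep the row, leave ?email unbound.
            rows.append((name, None))
    return rows


result = select_name_optional_email(TRIPLES)
print(result)
```

The row for "Bob" survives with an empty email slot, whereas a query with both patterns required would drop it.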
Build an OPTIONAL Query
Drag the components to build a SPARQL query that finds all books and their optional descriptions:
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?book ?title
WHERE {
?book a <http://example.org/Book> .
?book dc:title ?title .
}
Aggregations
Counting and Grouping Results
SPARQL supports aggregate functions similar to SQL, including COUNT, SUM, AVG, MIN, and MAX.
These are typically used with GROUP BY to group results by specific variables.
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?author (COUNT(?book) AS ?bookCount)
WHERE {
?book dc:creator ?author .
}
GROUP BY ?author
ORDER BY DESC(?bookCount)
This query counts how many books each author has written and sorts by the count in descending order.
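The group, count, and sort steps can be illustrated with a short Python sketch. The triples and the `books_per_author` function are assumptions invented for this example; a real SPARQL engine performs the grouping itself.

```python
from collections import Counter

# Toy data: (book, predicate, author) triples, invented for this example.
TRIPLES = [
    ("ex:book1", "dc:creator", "ex:alice"),
    ("ex:book2", "dc:creator", "ex:bob"),
    ("ex:book3", "dc:creator", "ex:alice"),
    ("ex:book4", "dc:creator", "ex:alice"),
    ("ex:book5", "dc:creator", "ex:bob"),
    ("ex:book6", "dc:creator", "ex:carol"),
]


def books_per_author(triples):
    """Count dc:creator triples per author, highest count first."""
    # GROUP BY ?author with COUNT(?book): tally one per matching triple.
    counts = Counter(author for _, pred, author in triples if pred == "dc:creator")
    # ORDER BY DESC(?bookCount): sort the groups by their count, descending.
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


result = books_per_author(TRIPLES)
print(result)
```

Each tuple in the result corresponds to one group, that is, one row of the aggregated SELECT result.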
Aggregation Exercise
Complete the aggregation query by selecting the correct functions:
PREFIX ex: <http://example.org/>
SELECT ?publisher
(_____ AS ?totalBooks)
(_____ AS ?avgPrice)
WHERE {
?book ex:publisher ?publisher .
?book ex:price ?price .
}
_____
HAVING (_____ > 5)
ORDER BY DESC(?avgPrice)
Practical Examples
Real-world SPARQL Queries
Let's examine some practical SPARQL queries you might use in real applications.
Query Analysis
Review each SPARQL query and select what it accomplishes:
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX sosa: <http://www.w3.org/ns/sosa/>
SELECT ?sensor ?value ?time
WHERE {
?sensor a sosa:Sensor .
?observation sosa:madeBySensor ?sensor .
?observation sosa:hasSimpleResult ?value .
?observation sosa:resultTime ?time .
FILTER(?time > "2023-01-01T00:00:00Z"^^xsd:dateTime)
}
What does this query do?
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?concept ?label ?broader
WHERE {
?concept a skos:Concept .
?concept skos:prefLabel ?label .
OPTIONAL {
?concept skos:broader ?broaderConcept .
?broaderConcept skos:prefLabel ?broader .
}
FILTER(LANG(?label) = 'en')
}
What does this query do?