Introduction to SHACL
SHACL (Shapes Constraint Language) is a W3C Recommendation for validating RDF data. It lets you define the constraints that your RDF data must satisfy to be considered valid.
With SHACL, you can:
- Validate the structure and content of RDF graphs
- Ensure data consistency and quality
- Define rules for what constitutes valid data in your domain
- Generate helpful validation reports when data doesn't conform
SHACL Basics
SHACL is built around two key concepts:
- Data Graph: The RDF graph that you want to validate
- Shapes Graph: The RDF graph containing validation rules (shapes)
When you run a SHACL validator, it checks the Data Graph against the constraints defined in the Shapes Graph and produces a validation report.
For example, a shapes graph can require that every instance of ex:Person has an ex:age property whose value is a non-negative integer.
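A minimal sketch of such a shapes graph in Turtle, together with a small data graph to check against it. The ex: namespace IRI, the shape name ex:PersonShape, and the individuals ex:Alice and ex:Bob are assumptions made for illustration. In practice the two graphs usually live in separate files.

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:  <http://example.org/> .   # hypothetical namespace

# Shapes graph: every ex:Person needs an ex:age that is a non-negative integer.
ex:PersonShape
    a sh:NodeShape ;
    sh:targetClass ex:Person ;
    sh:property [
        sh:path ex:age ;
        sh:minCount 1 ;              # the property must be present
        sh:datatype xsd:integer ;    # values must be xsd:integer literals
        sh:minInclusive 0 ;          # values must not be negative
    ] .

# Data graph: ex:Alice satisfies the shape; ex:Bob does not (negative age).
ex:Alice a ex:Person ; ex:age 30 .
ex:Bob   a ex:Person ; ex:age -5 .
```

A validator run over these graphs would be expected to report one result for ex:Bob, raised by the sh:minInclusive constraint.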
SHACL Components Matching Exercise
Match each SHACL concept with its correct description:
1. Data Graph
2. Shapes Graph
3. Validation Report
4. Node Shape
NodeShapes
In SHACL, NodeShapes define constraints that apply to nodes in the data graph. The nodes a shape is validated against are called focus nodes, and a shape selects them using one of several targeting mechanisms:
- Class-based targeting: Apply to all instances of a specific class
- Node targeting: Apply to specific individual nodes
- Subject-of targeting: Apply to subjects of specific predicates
- Object-of targeting: Apply to objects of specific predicates
NodeShapes can contain property constraints (PropertyShapes) and can directly specify constraints like sh:closed to restrict which properties are allowed.
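The four targeting mechanisms and sh:closed can be sketched as follows. All ex: names (classes, properties, shapes, and the individual ex:Alice) are hypothetical and chosen only to illustrate the syntax.

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix ex:  <http://example.org/> .   # hypothetical namespace

# Class-based targeting: every instance of ex:Person is a focus node.
# sh:closed rejects any property that is not listed via sh:property,
# except those named in sh:ignoredProperties.
ex:PersonShape
    a sh:NodeShape ;
    sh:targetClass ex:Person ;
    sh:closed true ;
    sh:ignoredProperties ( rdf:type ) ;
    sh:property [ sh:path ex:name ; sh:minCount 1 ] ;
    sh:property [ sh:path ex:email ] ;
    sh:property [ sh:path ex:worksFor ] .

# Node targeting: only the individual ex:Alice is a focus node.
ex:AliceShape
    a sh:NodeShape ;
    sh:targetNode ex:Alice ;
    sh:property [ sh:path ex:email ; sh:minCount 1 ] .

# Subject-of targeting: every node that has an ex:worksFor value
# must itself be an instance of ex:Person.
ex:EmployeeShape
    a sh:NodeShape ;
    sh:targetSubjectsOf ex:worksFor ;
    sh:class ex:Person .

# Object-of targeting: every value of ex:worksFor must be an IRI.
ex:EmployerShape
    a sh:NodeShape ;
    sh:targetObjectsOf ex:worksFor ;
    sh:nodeKind sh:IRI .
```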
NodeShape Targeting Exercise
Match each targeting mechanism with its purpose:
1. sh:targetClass
2. sh:targetNode
3. sh:targetSubjectsOf
4. sh:targetObjectsOf
PropertyShapes
PropertyShapes define constraints on properties and their values. They are typically nested within NodeShapes using the sh:property predicate.
Key aspects of PropertyShapes include:
- Property Path: Specifies which property the constraints apply to
- Value Type Constraints: Restrict the type of values (datatype, class, node kind)
- Cardinality Constraints: Control how many values a property can have
- String Constraints: Define patterns and length restrictions for string values
- Value Range Constraints: Specify minimum/maximum values for numbers
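The aspects listed above might be combined as in the sketch below. It shows a PropertyShape declared on its own and referenced with sh:property, alongside one nested inline. The property names ex:username and ex:height and the specific limits are assumptions for illustration.

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:  <http://example.org/> .   # hypothetical namespace

# A standalone property shape that can be reused by several node shapes.
ex:UsernameShape
    a sh:PropertyShape ;
    sh:path ex:username ;                 # property path
    sh:datatype xsd:string ;              # value type constraint
    sh:minCount 1 ;                       # cardinality constraints
    sh:maxCount 1 ;
    sh:pattern "^[a-z][a-z0-9_]*$" ;      # string constraints
    sh:minLength 3 ;
    sh:maxLength 32 .

ex:PersonShape
    a sh:NodeShape ;
    sh:targetClass ex:Person ;
    sh:property ex:UsernameShape ;        # reference to the shape above
    sh:property [                         # nested (blank node) property shape
        sh:path ex:height ;
        sh:datatype xsd:decimal ;
        sh:minExclusive 0 ;               # value range constraint
    ] .
```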
PropertyShape Exercise
Complete the PropertyShape by dragging the correct components to the appropriate locations:
Create a PropertyShape that validates a person's age is between 0 and 120 years:
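One possible result, as a sketch: it assumes the age is stored in a property named ex:age, that it is an xsd:integer, and that both bounds are inclusive. None of these choices is fixed by the exercise text.

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:  <http://example.org/> .   # hypothetical namespace

ex:PersonShape
    a sh:NodeShape ;
    sh:targetClass ex:Person ;
    sh:property [
        sh:path ex:age ;
        sh:datatype xsd:integer ;
        sh:minInclusive 0 ;      # lower bound, assumed inclusive
        sh:maxInclusive 120 ;    # upper bound, assumed inclusive
    ] .
```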
Constraint Types
SHACL provides a wide range of constraint types. Here are some of the most commonly used:
Value Type Constraints
- sh:datatype: Specifies the expected RDF datatype (e.g., xsd:string)
- sh:class: Requires values to be instances of a specific class
- sh:nodeKind: Restricts the kind of node (sh:IRI, sh:BlankNode, sh:Literal, or a combination such as sh:BlankNodeOrIRI)
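A short sketch of the three value type constraints, one per property. The properties ex:birthDate, ex:employer, and ex:homepage and the class ex:Organization are hypothetical.

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:  <http://example.org/> .   # hypothetical namespace

ex:PersonShape
    a sh:NodeShape ;
    sh:targetClass ex:Person ;
    # sh:datatype: values must be literals typed as xsd:date
    sh:property [ sh:path ex:birthDate ; sh:datatype xsd:date ] ;
    # sh:class: values must be instances of ex:Organization
    sh:property [ sh:path ex:employer ; sh:class ex:Organization ] ;
    # sh:nodeKind: values must be IRIs, not literals or blank nodes
    sh:property [ sh:path ex:homepage ; sh:nodeKind sh:IRI ] .
```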
Cardinality Constraints
- sh:minCount: Minimum number of values
- sh:maxCount: Maximum number of values
Value Range Constraints
- sh:minExclusive/sh:minInclusive: Lower bound, with the bound itself excluded or included
- sh:maxExclusive/sh:maxInclusive: Upper bound, with the bound itself excluded or included
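The difference between exclusive and inclusive bounds can be sketched like this. The class ex:Product and the properties ex:price and ex:discountPercent are assumptions for illustration.

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:  <http://example.org/> .   # hypothetical namespace

ex:ProductShape
    a sh:NodeShape ;
    sh:targetClass ex:Product ;
    sh:property [
        sh:path ex:price ;
        sh:datatype xsd:decimal ;
        sh:minExclusive 0 ;          # must be strictly greater than 0
    ] ;
    sh:property [
        sh:path ex:discountPercent ;
        sh:datatype xsd:integer ;
        sh:minInclusive 0 ;          # 0 is allowed
        sh:maxInclusive 100 ;        # 100 is allowed
    ] .
```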
String Constraints
- sh:pattern: Regular expression pattern
- sh:minLength/sh:maxLength: String length limitations
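A sketch of the string constraints, assuming hypothetical properties ex:postalCode and ex:languageCode. The optional sh:flags property passes regular expression flags to sh:pattern; here "i" makes the match case-insensitive. The five-digit postal code format is only an example.

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix ex:  <http://example.org/> .   # hypothetical namespace

ex:AddressShape
    a sh:NodeShape ;
    sh:targetClass ex:Address ;
    sh:property [
        sh:path ex:postalCode ;
        sh:pattern "^[0-9]{5}$" ;    # exactly five digits
    ] ;
    sh:property [
        sh:path ex:languageCode ;
        sh:pattern "^[a-z]{2}$" ;    # two letters ...
        sh:flags "i" ;               # ... in either case
        sh:minLength 2 ;
        sh:maxLength 2 ;
    ] .
```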
Property Pair Constraints
- sh:equals: The set of values of one property must equal the set of values of another property
- sh:disjoint: The values of two properties must not overlap (no value may appear in both)
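Both property pair constraints compare the values of the sh:path property with the values of a second property on the same focus node. A sketch with hypothetical property names:

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix ex:  <http://example.org/> .   # hypothetical namespace

ex:PersonShape
    a sh:NodeShape ;
    sh:targetClass ex:Person ;
    # The values of ex:billingAddress and ex:shippingAddress must be the same set.
    sh:property [
        sh:path ex:billingAddress ;
        sh:equals ex:shippingAddress ;
    ] ;
    # No value of ex:currentName may also be a value of ex:formerName.
    sh:property [
        sh:path ex:currentName ;
        sh:disjoint ex:formerName ;
    ] .
```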
Logical Constraints
- sh:not: The node must not conform to the given shape
- sh:and: The node must conform to all of the listed shapes
- sh:or: The node must conform to at least one of the listed shapes
- sh:xone: The node must conform to exactly one of the listed shapes
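The logical constraints take shapes as their values: sh:not takes a single shape, while sh:and, sh:or, and sh:xone take an RDF list of shapes. A sketch, with all ex: names hypothetical:

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:  <http://example.org/> .   # hypothetical namespace

ex:NamedShape
    a sh:NodeShape ;
    sh:property [ sh:path ex:name ; sh:minCount 1 ] .

ex:ContactShape
    a sh:NodeShape ;
    sh:targetClass ex:Contact ;
    # sh:and: must conform to ex:NamedShape and also be an IRI
    sh:and ( ex:NamedShape [ sh:nodeKind sh:IRI ] ) ;
    # sh:xone: must have an email or a phone number, but not both
    sh:xone (
        [ sh:property [ sh:path ex:email ; sh:minCount 1 ] ]
        [ sh:property [ sh:path ex:phone ; sh:minCount 1 ] ]
    ) ;
    # sh:not: must not be an instance of ex:BlockedContact
    sh:not [ sh:class ex:BlockedContact ] ;
    # sh:or: each identifier is either a string or an integer
    sh:property [
        sh:path ex:identifier ;
        sh:or ( [ sh:datatype xsd:string ] [ sh:datatype xsd:integer ] ) ;
    ] .
```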
Constraints Quiz
Choose the most appropriate constraint for each validation requirement:
1. For validating that an email property follows a standard email format:
2. For ensuring a temperature value is between -50 and 50 degrees:
3. For verifying that a property value is an IRI (not a literal or blank node):
Cardinality Constraints
Cardinality constraints in SHACL control how many values a property can or must have. They help ensure completeness (required values are present) and guard against conflicting duplicates (a single-valued property does not end up with several values).
SHACL provides two primary cardinality constraints:
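- sh:minCount: The minimum number of values required (e.g., "at least one email address"). If omitted, there is no lower limit, which is equivalent to 0.
- sh:maxCount: The maximum number of values allowed (e.g., "at most one birth date"). If omitted, the number of values is unbounded.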
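The two examples just mentioned can be sketched as a shape. The property names ex:email and ex:birthDate are assumptions for illustration.

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix ex:  <http://example.org/> .   # hypothetical namespace

ex:PersonShape
    a sh:NodeShape ;
    sh:targetClass ex:Person ;
    # "At least one email address": lower limit only, no upper limit.
    sh:property [
        sh:path ex:email ;
        sh:minCount 1 ;
    ] ;
    # "At most one birth date": upper limit only, so the value is optional.
    sh:property [
        sh:path ex:birthDate ;
        sh:maxCount 1 ;
    ] .
```

Setting sh:minCount and sh:maxCount to the same number expresses an exact count.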
Cardinality Constraints Exercise
For each scenario, select the appropriate cardinality constraints:
Scenario 1:
A person must have exactly one unique identifier.
Scenario 2:
A sensor may have zero or more measurement observations.
Scenario 3:
A book must have at least one author but could have multiple authors.
Validation Severity Levels
SHACL provides different severity levels to indicate the importance of constraints:
- sh:Violation - The most severe level, indicating that the data must be fixed
- sh:Warning - Less severe, suggesting the data should be reviewed
- sh:Info - Informational only, with no corrective action required
If a shape does not specify sh:severity, its severity defaults to sh:Violation.
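A severity is attached to a shape with sh:severity and shows up in the validation report as sh:resultSeverity. The sketch below pairs a shape with the kind of report a validator might produce for a person without a phone number. The names ex:PhoneShape, ex:phone, and ex:Alice and the message text are hypothetical, and real validators may include additional details in each result.

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix ex:  <http://example.org/> .   # hypothetical namespace

# Shapes graph: a missing phone number is only worth a warning.
ex:PhoneShape
    a sh:PropertyShape ;
    sh:path ex:phone ;
    sh:minCount 1 ;
    sh:severity sh:Warning ;
    sh:message "A phone number is recommended." .

ex:PersonShape
    a sh:NodeShape ;
    sh:targetClass ex:Person ;
    sh:property ex:PhoneShape .

# Sketch of a validation report for ex:Alice, who has no ex:phone value.
[] a sh:ValidationReport ;
    sh:conforms false ;
    sh:result [
        a sh:ValidationResult ;
        sh:resultSeverity sh:Warning ;
        sh:focusNode ex:Alice ;
        sh:resultPath ex:phone ;
        sh:sourceShape ex:PhoneShape ;
        sh:sourceConstraintComponent sh:MinCountConstraintComponent ;
        sh:resultMessage "A phone number is recommended." ;
    ] .
```

Note that sh:conforms is false whenever the report contains any result, whatever its severity; it is up to the consuming application to decide how to treat warnings and informational results.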
Exercise: Choosing Appropriate Severity Levels
For a constraint that validates whether a person's email follows correct format:
For a constraint that checks if a publication has at least one keyword:
Practical SHACL Examples
Let's examine some real-world examples of SHACL validation in different domains:
Example 1: Validating Person Data
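A minimal sketch of what a shape for person data could look like. Every ex: name, the chosen datatypes, and the email pattern are assumptions for illustration, not a fixed schema.

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:  <http://example.org/> .   # hypothetical namespace

ex:PersonShape
    a sh:NodeShape ;
    sh:targetClass ex:Person ;
    # Exactly one name, given as a string.
    sh:property [
        sh:path ex:name ;
        sh:datatype xsd:string ;
        sh:minCount 1 ;
        sh:maxCount 1 ;
    ] ;
    # Optional age, a non-negative integer.
    sh:property [
        sh:path ex:age ;
        sh:datatype xsd:integer ;
        sh:minInclusive 0 ;
        sh:maxCount 1 ;
    ] ;
    # Any number of email addresses, each loosely shaped like name@host.
    sh:property [
        sh:path ex:email ;
        sh:datatype xsd:string ;
        sh:pattern "^[^@\\s]+@[^@\\s]+$" ;
    ] ;
    # Acquaintances must themselves be persons.
    sh:property [
        sh:path ex:knows ;
        sh:class ex:Person ;
    ] .
```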
Example 2: Validating Academic Publications
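A comparable sketch for publications, again with hypothetical ex: names and limits. It combines cardinality, value type, and string constraints with a non-default severity for a recommendation that should not block the data.

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:  <http://example.org/> .   # hypothetical namespace

ex:PublicationShape
    a sh:NodeShape ;
    sh:targetClass ex:Publication ;
    # Exactly one non-empty title.
    sh:property [
        sh:path ex:title ;
        sh:datatype xsd:string ;
        sh:minLength 1 ;
        sh:minCount 1 ;
        sh:maxCount 1 ;
    ] ;
    # One or more authors, each an instance of ex:Person.
    sh:property [
        sh:path ex:author ;
        sh:class ex:Person ;
        sh:minCount 1 ;
    ] ;
    # At most one publication year.
    sh:property [
        sh:path ex:publicationYear ;
        sh:datatype xsd:gYear ;
        sh:maxCount 1 ;
    ] ;
    # Keywords are recommended rather than required.
    sh:property [
        sh:path ex:keyword ;
        sh:minCount 1 ;
        sh:severity sh:Warning ;
        sh:message "Publications should have at least one keyword." ;
    ] .
```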
SHACL Shape Builder
Build a SHACL shape for validating bibliographic entries by selecting the appropriate components: