Using ChatGPT to Refine REGEX in SPARQL queries

Michael DeBellis

In a previous post, I shared some SPARQL queries that take user-defined IRIs and generate labels from them using a REGEX. The queries assume the IRIs follow the CamelBack naming standard for classes and instances and reverseCamelBack for properties. One bug in those queries was that they didn't handle acronyms correctly. E.g., if there was a class called NASAEmployee, the generated label would be "N A S A Employee". Also, for classes and instances, there was a leading blank at the start of the string that I had to remove. It took me several hours to figure out the REGEX for those queries, and I never thought it was worth the time to work out how to handle acronyms as well. Instead, I would just fix any acronym labels by hand. But I tried using ChatGPT to create the appropriate REGEX and it worked. To see the dialog with ChatGPT that produced these expressions, see the Wiki page on my SPARQL utilities GitHub repository.
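To make the two symptoms concrete, here is a small Python sketch. The pattern in it is a hypothetical naive rule (put a blank before every capital letter), not the REGEX from the earlier queries, and `naive_label` is a name I made up for illustration. It just happens to reproduce both problems described above: the split acronym and the leading blank.

```python
import re


def naive_label(name: str) -> str:
    """Hypothetical naive rule: insert a blank before every capital letter."""
    return re.sub(r"([A-Z])", r" \1", name)


# repr() makes the leading blank visible.
print(repr(naive_label("NASAEmployee")))  # ' N A S A Employee'
print(repr(naive_label("hasEmployer")))   # 'has Employer'
```

Property names start with a lowercase letter, so they don't pick up the leading blank, which matches the behavior I saw only on classes and instances.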


I've updated the SPARQL files on the GitHub repository. Remember there are two files: one for use in Snap SPARQL in Protégé, which uses CONSTRUCT, and another for other SPARQL implementations, which uses INSERT. If you just want to copy/paste, here are the revised queries for use within Snap SPARQL. Note that each query extracts the local name with STRAFTER(STR(...), 'test/'), so replace 'test/' with the end of your own ontology's namespace:

PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
#Create labels for all Classes
CONSTRUCT {?c rdfs:label ?lblname.}
WHERE {?c rdfs:subClassOf owl:Thing.
	BIND(STRAFTER(STR(?c), 'test/') as ?name)
	# First, separate lowercase-uppercase transitions that are not part of an acronym
	BIND(REPLACE(?name, "([a-z])([A-Z])", "$1 $2") AS ?step1)
	# Then, separate acronyms from following words
	BIND(REPLACE(?step1, "([A-Z]+)([A-Z][a-z])", "$1 $2") AS ?lblname)
	OPTIONAL{?c rdfs:label ?elbl.}
	FILTER(?c != owl:Thing && ?c != owl:Nothing && !BOUND(?elbl))}
#Create labels for all Individuals
CONSTRUCT {?i rdfs:label ?lblname.}
WHERE {?i a owl:Thing.
	BIND(STRAFTER(STR(?i), 'test/') as ?name)
	# First, separate lowercase-uppercase transitions that are not part of an acronym
	BIND(REPLACE(?name, "([a-z])([A-Z])", "$1 $2") AS ?step1)
	# Then, separate acronyms from following words
	BIND(REPLACE(?step1, "([A-Z]+)([A-Z][a-z])", "$1 $2") AS ?lblname)
	OPTIONAL{?i rdfs:label ?elbl.}
	FILTER(!BOUND(?elbl))}			
#Create labels for all Object Properties
CONSTRUCT {?p rdfs:label ?lblname.}
WHERE {?p a owl:ObjectProperty.
	BIND(STRAFTER(STR(?p), 'test/') as ?name)
	# First, separate lowercase-uppercase transitions that are not part of an acronym
	BIND(REPLACE(?name, "([a-z])([A-Z])", "$1 $2") AS ?step1)
	# Then, separate acronyms from following words
	BIND(REPLACE(?step1, "([A-Z]+)([A-Z][a-z])", "$1 $2") AS ?lblname)
	OPTIONAL{?p rdfs:label ?elbl.}
	FILTER(?p != owl:topObjectProperty && !BOUND(?elbl))}
#Create labels for all Data Properties
CONSTRUCT {?p rdfs:label ?lblname.}
WHERE {?p a owl:DatatypeProperty.
	BIND(STRAFTER(STR(?p), 'test/') as ?name)
	# First, separate lowercase-uppercase transitions that are not part of an acronym
	BIND(REPLACE(?name, "([a-z])([A-Z])", "$1 $2") AS ?step1)
	# Then, separate acronyms from following words
	BIND(REPLACE(?step1, "([A-Z]+)([A-Z][a-z])", "$1 $2") AS ?lblname)
	OPTIONAL{?p rdfs:label ?elbl.}
	FILTER(?p != owl:topDataProperty && !BOUND(?elbl))}
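
If you want to try the two REPLACE steps without loading an ontology, the same patterns can be exercised in Python. This is only a sketch: `make_label` and the sample names are my own inventions for illustration, and Python writes the group references as `\1 \2` where SPARQL uses `$1 $2`. For these simple patterns I would expect the two regex engines to agree, but I have only reasoned through the Python side here.

```python
import re


def make_label(local_name: str) -> str:
    """Mimic the two REPLACE steps from the queries on a local name."""
    # Step 1: blank between a lowercase letter and the capital that follows it.
    step1 = re.sub(r"([a-z])([A-Z])", r"\1 \2", local_name)
    # Step 2: blank between a run of capitals (an acronym) and the
    # capitalized word that follows it.
    return re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1 \2", step1)


for name in ["NASAEmployee", "hasNASAEmployer", "PizzaTopping", "NASA"]:
    print(name, "->", make_label(name))
# NASAEmployee -> NASA Employee
# hasNASAEmployer -> has NASA Employer
# PizzaTopping -> Pizza Topping
# NASA -> NASA
```

Because neither pattern matches at the very start of the name unless an acronym is followed by another word, there is no leading blank to strip afterwards.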

The following shows an example of using the new queries on some test classes:


©2019 by Michael DeBellis. Proudly created with Wix.com
