By Rebecca Y. Bayeck, Ph.D.
This blogpost introduces you to the basics of Wikidata editing (my colleague jay winkler explains querying Wikidata in a separate blogpost).
Wikidata is “a free, collaborative, multilingual, secondary database, collecting structured data to provide support for Wikipedia, Wikimedia Commons, the other wikis of the Wikimedia movement, and to anyone in the world.” In other words, Wikidata is a collection of organized data, entered, maintained, and edited by individuals, that relies on other databases to make information available to anyone in the world. Because this collection is open to anyone and available in multiple languages, editing Wikidata means adding, adjusting, or removing data in it. I will add that, although changes can be made without logging in, editing from a registered account is strongly recommended, as explained below.
So, how does one get started with editing Wikidata?
Figure 1. Wikidata Website
The Process of Wikidata Editing
As previously mentioned, Wikidata is available to anyone in the world, which implies that anyone can edit it, provided they follow the structure and concepts the system defines. To start editing, one needs to:
1. Create a free account. If you do not create an account, your IP address will be recorded alongside any changes you make, so be sure to create an account and log in to it whenever you edit.
Figure 2. Creating a Wikidata Account
2. Click on Create a new item. In this context, an item is anything related to human knowledge, including topics, concepts, names, and objects. If the item you are trying to create already exists, you can simply contribute to the existing record (see step 3).
Figure 3. Item Page Creation
3. On the Create a new item page, enter the item information as described and click on Create.
Bravo!! You have contributed to or edited Wikidata!
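For readers who like to see the form as data, here is a minimal sketch of the information the Create a new item page asks for, arranged as a Python dictionary. The layout follows the JSON shape that, as far as I understand it, the Wikibase API uses for labels, descriptions, and aliases. The helper name `build_new_item_payload` and the sample values are hypothetical, and nothing in this snippet is sent to Wikidata.

```python
import json


def build_new_item_payload(language, label, description, aliases=()):
    """Arrange the 'Create a new item' form fields as a Wikibase-style dict.

    Hypothetical helper for illustration only; it builds data locally
    and never contacts Wikidata.
    """
    if not label.strip():
        raise ValueError("An item needs a label")
    payload = {
        "labels": {language: {"language": language, "value": label.strip()}},
        "descriptions": {
            language: {"language": language, "value": description.strip()}
        },
    }
    # Aliases are optional alternative names for the same item.
    if aliases:
        payload["aliases"] = {
            language: [{"language": language, "value": a} for a in aliases]
        }
    return payload


# Made-up example values, not a real Wikidata record.
payload = build_new_item_payload(
    "en",
    "Example Artist",
    "hypothetical painter used for illustration",
    aliases=["E. Artist"],
)
print(json.dumps(payload, indent=2))
```

Each field is keyed by language code, which mirrors the fact that the same item can carry a label and description in many languages.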
Creating a new item page (Figure 3) is itself an act of editing Wikidata. Still, there is more to an item page. When a new item is created, Wikidata assigns it a unique identifier, also called a Q number, which helps distinguish items with identical or similar names. For instance, the likelihood of having multiple items named Edward Smith is high. To differentiate them, Wikidata assigns each item its own Q number (Figure 4) and also provides space for the editor to add more information about the item in the form of statements, properties, and values.
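To make the role of the Q number concrete, here is a small sketch, assuming only that a Q number is the letter Q followed by a positive integer. The two records, including their identifiers and descriptions, are invented placeholders and not real Wikidata items.

```python
import re

# "Q" followed by a positive integer with no leading zero, e.g. "Q42".
Q_NUMBER = re.compile(r"Q[1-9][0-9]*")


def is_q_number(text):
    """Return True if text has the shape of a Wikidata Q number."""
    return Q_NUMBER.fullmatch(text) is not None


# Hypothetical records: identifiers and descriptions are placeholders.
items = [
    {"id": "Q900000001", "label": "Edward Smith", "description": "placeholder: a painter"},
    {"id": "Q900000002", "label": "Edward Smith", "description": "placeholder: a sailor"},
]

# Group identifiers by label to show that one name can map to several items.
by_label = {}
for item in items:
    by_label.setdefault(item["label"], []).append(item["id"])

print(by_label)
```

The label alone cannot tell the two records apart; the identifier can, which is why it is safer to refer to an item by its Q number than by its name.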
Figure 4. Q Identifier of an Item
A statement is how the information or data we have about an item gets saved in Wikidata. Each statement pairs a property with a value: the property can be understood as a category of data, and the value is the specific data recorded for the item under that category. Just like an item, a property has a unique identifier, in this case a P number. Wikidata maintains a list of properties, and each property has its own label and description. Many properties also expect a reference URL or a matching edit to a related item (e.g., if item A is the child of item B, item B needs a statement saying it is the parent of item A), so you may see a flag or exclamation point icon next to a statement. Click the icon for details of the edit or requirement the database is raising.
Figure 4 above shows an item page with its Q number, which makes the item unique, along with its properties. For instance, the item for Lynn Washington includes the properties occupation (P106), ethnicity (P172), and exhibition history (P608).
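The relationship between an item, its statements, and their properties and values can be sketched as a small data model. This is only an illustration under my own assumptions: the class names `Item` and `Statement` are hypothetical, the Q number is a placeholder, and the values are left as stand-in text because the actual values are not listed in this post. Only the three P numbers come from the example above.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    property_id: str     # P number, e.g. "P106"
    property_label: str  # human-readable name of the property
    value: str           # stand-in text; real values are often other items


@dataclass
class Item:
    q_number: str
    label: str
    statements: list = field(default_factory=list)

    def add_statement(self, property_id, property_label, value):
        """Attach one property/value pair to the item."""
        self.statements.append(Statement(property_id, property_label, value))

    def values_for(self, property_id):
        """Return every value recorded under the given property."""
        return [s.value for s in self.statements if s.property_id == property_id]


# The Q number and all values below are placeholders, not the real record.
item = Item("Q_PLACEHOLDER", "Lynn Washington")
item.add_statement("P106", "occupation", "<occupation value>")
item.add_statement("P172", "ethnicity", "<ethnicity value>")
item.add_statement("P608", "exhibition history", "<exhibition value>")

for s in item.statements:
    print(f"{item.label} | {s.property_label} ({s.property_id}) | {s.value}")
```

An item can hold several statements for the same property, which is why `values_for` returns a list.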
Editing Wikidata depends on the information or data one has available. Editing also means that one can add to an existing item page. The process described above is the manual way of editing Wikidata. Another approach is to use open-source applications such as OpenRefine.
Editing Wikidata with OpenRefine
OpenRefine allows for automated and rapid editing of Wikidata by importing data saved in files such as TSV, CSV, *SV, and Excel (.xls and .xlsx). This section is a brief introduction to editing with OpenRefine. To start the editing process, you need to download and install OpenRefine.
Launch OpenRefine and import the data.
Figure 5. Importing Data into OpenRefine
Select the appropriate file and click Next to upload the data. Click on the down arrow (see Wikidata: Tools/OpenRefine), and in the dialog box, select Reconcile, then Start reconciling (Figure 6). OpenRefine has a Wikidata extension, which allows it to match the data in the file against Wikidata. Instead of manually searching for each item in Wikidata, the editor can learn in a couple of minutes which items are already in Wikidata and which are not.
Figure 6. OpenRefine Matching Process
In the dialog box that opens, click on Wikidata reconciliation for OpenRefine, then select the type of data. In this case, I selected humans, given that our data are about artists, before clicking Start reconciling.
A completed reconciliation yields results similar to those in Figure 7. In the left pane, select match to see the data in the file that matched Wikidata records, and none to see the data not found in Wikidata.
Figure 7. Sample of Results from the Reconciliation Process
Editing then consists of manually adding the non-matching data to Wikidata. Even so, the benefit of using OpenRefine lies in the rapid identification of matching and non-matching records.
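The idea of sorting records into matching and non-matching groups can be illustrated with a toy sketch. This is not how OpenRefine works internally: it relies on a reconciliation service that scores candidate matches, whereas the sketch below only does an exact lookup on a normalized name. The lookup table `KNOWN_ITEMS`, its placeholder Q numbers, and the names in the sample CSV are all invented.

```python
import csv
import io

# Hypothetical stand-in for Wikidata: normalized label -> placeholder Q number.
KNOWN_ITEMS = {"ada example": "Q900000010", "ben sample": "Q900000011"}

# Stand-in for an imported CSV file; the names are invented.
CSV_TEXT = "name\nAda Example\nBen  Sample\nCleo Placeholder\n"


def normalize(name):
    """Lowercase the name and collapse repeated whitespace."""
    return " ".join(name.lower().split())


def reconcile(rows, known):
    """Split rows into (name, Q number) matches and unmatched names."""
    matched, unmatched = [], []
    for row in rows:
        q_number = known.get(normalize(row["name"]))
        if q_number:
            matched.append((row["name"], q_number))
        else:
            unmatched.append(row["name"])
    return matched, unmatched


rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
matched, unmatched = reconcile(rows, KNOWN_ITEMS)

print("matched:", matched)
print("none:", unmatched)
```

The names left in the unmatched list correspond to the records one would then review and add to Wikidata by hand.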
This blog post is an invitation to contribute to Wikidata editing. As previously stated, anyone can edit Wikidata, and there is still a lot of knowledge not yet included in this open database. The diversity of editors in terms of knowledge, cultural background, geographic location, and ethnicity can broaden and expand what is added to the database. Check out this blog post to get started designing a Wikidata project for your community or institution.
There is no doubt that the question of data ownership can be posed, especially for communities with limited resources or communities that have been historically marginalized. The fact that the database is open and available to anyone in the world does not mean that the editor or the community that contributed the information owns it. For instance, while anyone can edit Wikidata, only administrators can delete a page or an item.
The issue of data ownership cannot be addressed in this blog post. Still, it is critical to raise the question. The historical experiences of these communities may lead them to resist contributing to Wikidata as a means of preservation or protection, or these groups may be marginalized or overlooked because they are often not thought of as critical participants in these kinds of endeavors.