This post is the fifth in a series working out a possible methodology for
assessing and accounting for databases containing Shakespearean texts. After an
introductory post, four others are dedicated to listing, explaining and
contextualizing questions that might come in handy when thinking about these
databases. So far, the areas of basic facts, transparency and flexibility have
been covered in the first three of these, and now, as promised, I am going to
consider and present questions pertaining to what I would like to term
“interdisciplinary openness.”
Most of the databases reduce texts to their linguistic aspect. Queries focus on
words, strings of words, linguistic units, grammatical units and verbal
statistics. They can also visualize tendencies and create diagrams in a variety
of formats about the linguistic construction of the text. All this is fine, as
most of the time a reader of a Shakespearean play will be interested in the
ways the text communicates its layers of meaning through verbal means. There
has been, however, a tendency in scholarly circles to argue, in a great number
of ways, that a text reveals layers of meaning not only through its linguistic
construction: meaning is also a social construct embedded in the material ways
a text functions in the world. So, scholars claim that bibliographical data,
from the date of publication to the publisher, from the typeface to the type of
paper, from decoration to page size, play their part in the process of
constituting meaning. Here a long list of authors, theoretical and pragmatic,
may be presented, from David Scott Kastan to John N. King, from Woudhuysen to
McGann, from Shillingsburg to Hayles, and from Marshall McLuhan to Andrew
Murphy, to mention a few authorities in the field. It is beneficial if a
database allows for research other than that pertaining to the linguistic
aspect. The next three questions, thus, explore ways in which a database may
cater for interests in aspects other than the linguistic one.
- Format of the digital text (txt, xml, jpg, tiff etc.)
Interdisciplinary research presupposes that a complex range of questions can be
asked, and this complexity can only be supported by presenting the texts in a
variety of formats. Sometimes the best choice is a rather unmarked list of
words, e.g. in a txt file. This is sufficient and even more fruitful for some
queries, especially when it is not clear how a text analysis tool reads the
file. For another set of questions encoding is needed, say for tokenised or
lemmatised queries. At other times it is best to have images only, which may be
analyzed in ways unimaginable before. It is the format of the file that enables
these differing approaches, so it is an advantage if the same text is
accessible in a variety of formats.
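To make the difference between an unmarked text and an encoded one concrete, here is a minimal sketch in Python. It is an illustration only and does not describe how any particular database works. The function names are hypothetical, the lemma values were assigned by hand for this example, and the markup merely imitates the TEI `<w>` element with its `lemma` attribute. The TEI namespace that real files declare is left out for brevity.

```python
import re
import xml.etree.ElementTree as ET

# The same line in two formats. Lemma values are hand-assigned for illustration.
PLAIN = "To be, or not to be, that is the question"

ENCODED = (
    '<l>'
    '<w lemma="to">To</w> <w lemma="be">be</w>, '
    '<w lemma="or">or</w> <w lemma="not">not</w> '
    '<w lemma="to">to</w> <w lemma="be">be</w>, '
    '<w lemma="that">that</w> <w lemma="be">is</w> '
    '<w lemma="the">the</w> <w lemma="question">question</w>'
    '</l>'
)


def count_string(text, word):
    """Count whole-word, case-insensitive matches in unmarked text."""
    pattern = r"\b" + re.escape(word) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def forms_of_lemma(xml_text, lemma):
    """Return every word form whose lemma attribute equals `lemma`."""
    root = ET.fromstring(xml_text)
    return [w.text for w in root.iter("w") if w.get("lemma") == lemma]


if __name__ == "__main__":
    # A string search only finds the literal form it was given.
    print(count_string(PLAIN, "be"))
    # A lemmatised query also finds inflected forms such as "is".
    print(forms_of_lemma(ENCODED, "be"))
```

The plain string search needs no knowledge of how the file was prepared, which is why it can be the safer choice when the encoding is unknown. The lemmatised query depends entirely on the markup being present and consistent.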
- Is it the linguistic, digital or bibliographic aspect that is emphasized?
The linguistic aspect refers to the language, the linguistic elements of the
digital text. The bibliographical aspect refers to the material aspect, but in
this case it does not characterize the digital text as digital; rather, it
treats it as a rendering of the visual aspect of some original printed
material. The digital aspect refers to the computational coding of a text that
enables both the visual aspect and the searchable quality of these texts. It is
clear that builders of databases have to decide on what they intend to achieve.
Unfortunately, there is no database that lays equal emphasis on every aspect of
a digital text. Databases vary in whether they pay special attention to the
text as a linguistic unit, to the text as a deeply encoded entity that allows
for complex and intelligent queries, or to aspects that are relevant for the
historian of the book.
- Which aspect of the text is open to queries?
If it is possible to present the text in a variety of formats, then a variety
of disciplinary approaches may be accommodated within the database. If this is
so, it is also relevant which aspect of the text is open to queries, as it is
the query that makes computer-enabled research fruitful, faster and more
accurate. It is great if an image file is there to enable research related to
the history of the book, but if this aspect of the text is not open to queries,
computation is like a disabled giant: it is there, but the scholar cannot make
use of the power of computer technology. The Text Encoding Initiative enables
marking up a text for queries about the visual aspect of a work, and there are
even free image mark-up tools, so technologically it is not impossible to
prepare a database in which the bibliographical code is open to queries.
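As a rough illustration of what a query about the bibliographical code might look like, the sketch below searches a small TEI-style fragment for italic type and for catchwords. The fragment is invented for this example and does not transcribe any actual edition. The elements `<pb>`, `<fw>` and `<hi>` and the attributes `n`, `type` and `rend` are taken from the TEI vocabulary, while the function names are hypothetical and the TEI namespace is again omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Invented fragment: page breaks, a signature, a catchword and italic type.
SAMPLE = """<div>
  <pb n="A2r"/>
  <fw type="sig">A2</fw>
  <stage>Enter <hi rend="italic">Hamlet</hi>.</stage>
  <fw type="catch">Ham.</fw>
  <pb n="A2v"/>
  <p><hi rend="italic">Ham.</hi> Words, words, words.</p>
</div>"""


def italic_spans(root):
    """Return the text of every <hi> element marked as italic."""
    return [hi.text for hi in root.iter("hi") if hi.get("rend") == "italic"]


def catchwords_by_page(root):
    """Map each page label to the catchword recorded on that page."""
    page = None
    result = {}
    # root.iter() walks the elements in document order, so the most
    # recent <pb> tells us which page the current element belongs to.
    for element in root.iter():
        if element.tag == "pb":
            page = element.get("n")
        elif element.tag == "fw" and element.get("type") == "catch":
            result[page] = element.text
    return result


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    print(italic_spans(root))
    print(catchwords_by_page(root))
```

Neither query touches the wording of the play as such. Both ask about how the page was set, which is the kind of question a historian of the book might want a database to answer.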
* * *
This time, thus, we have seen the remaining three criteria for assessing a
database. These questions covered an area that I have labeled
“interdisciplinary openness.” The interdisciplinarity of a database manifests
itself in the variety of file formats and in the types of queries that a user
may conduct. Naturally, these criteria may or may not apply to each and every
database and can only be used as a means of orientation. So neither these three
criteria nor the other thirteen should be thought of as complete and binding,
but rather as a means of discussing a database or databases critically. What
follows from this is that a positive assessment does not require the highest
possible score on each and every criterion: a database can easily prove
fruitful in use even though a review based on the above sixteen criteria would
suggest that it is less good. Assessment at its best relies on criteria
relevant to the individual database. Having thus finished the meditation on the
criteria of assessment, next time I shall start a new series of posts exploring
databases one by one.