Bergen Language Design Laboratory (BLDL)
BLDL has an internal meeting series. Some of the meetings have content that may be of interest to a larger audience; the program for these is announced here.
Contact Magne Haveraaen for more information.
- Monday 2019-11-04 1415-1600,
Stort auditorium, Høyteknologisenteret
Stephen Michell (Convener ISO/IEC JTC1/SC22/WG23 - Programming Language Vulnerabilities, Canadian Standards Association (CSA Group) PM for ICT)
Application Security and Safety Vulnerabilities — A Language-Based Analysis
Our technological world is full of talk about threats to our safety and security from attacks on personal computing devices, our work ICT infrastructure, our commercial transactions, our ICT-enabled homes and cars, our buildings and our critical infrastructure. The concerns are well-founded as the threats are very real and serious. This talk examines the role of vulnerabilities in applications that come from mistakes made by programmers and by programming language designers that permit many of the application vulnerabilities to exist and to persist, and discusses how such vulnerabilities can be avoided through programming language selection, language use, coding standards, and language evolution.
The International Programming Language Vulnerabilities Working Group, ISO/IEC/JTC 1/SC 22/WG 23, publishes a series of International Technical Reports ISO/IEC TR 24772-x Programming Languages – Guidance to avoiding vulnerabilities in programming languages. The first TR, TR 24772-1, identifies 65 vulnerabilities that have their roots in the design and use of programming languages plus 34 design vulnerabilities that originate in the design and use of the application environment. It also provides concrete guidance on avoiding the documented vulnerabilities. The other TRs (TR 24772-2 Ada, TR 24772-3 C, TR 24772-4 Python, TR 24772-10 C++, etc.) document how the language vulnerabilities are manifested in that individual programming language and provide guidance on avoiding such vulnerabilities in the specific language.
Short bio: Stephen Michell has been a contributing member of the ICT and high integrity ICT communities for more than 30 years. He has concentrated on the safety, security and concurrency aspects of programming and of programming languages such as Ada through most of this time.
Stephen implemented Ada83 on a multiprocessor platform, was a distinguished reviewer for the Ada 9X project, authored the initial “Guidance for the use of Ada in high integrity systems” that became an International Technical Report, helped to develop the Ravenscar Tasking Profile, and developed proposals for (multicore) parallelism in Ada.
Stephen is deeply involved in the standardization of programming languages at the international level, having chaired the Canadian mirror committee to the ISO/IEC/JTC 1/SC 22 Programming Languages subcommittee and convening the ISO/IEC/JTC 1/SC 22/WG 23 Programming Language Vulnerabilities working group.
Stephen holds a Bachelor of Mathematics from the University of Waterloo, Canada, and an MSc in Mathematics and Systems Engineering from Carleton University, Canada.
- Friday 2019-04-26 1215-1400,
510N3 (the yoga room, informatikk), Høyteknologisenteret
Paul Meurer (University of Bergen Library, Department of Linguistic, Literary and Aesthetic Studies, University of Bergen, NO):
Querying linguistic databases – corpora and treebanks
In my lecture, I will give an introduction to databases of linguistic structures (corpora and treebanks) and show how such databases can be searched in a linguistically meaningful way.
Corpora are searchable collections of texts that are linguistically annotated, typically with part of speech, lemma and morphology information, and text-related metadata. Treebanks are special corpora of syntactically annotated texts; the name treebank derives from the fact that syntactic annotations often are tree structures (or in some cases more general directed graphs).
From a formal point of view, a corpus is a sequence of tokens (comprising the sequence of words of the texts), with attribute values attached to each one of the tokens, for a given set of attributes. A search language that is best suited to querying such a structure is a regular language; formally, it can be characterized as a regular expression calculus over the alphabet of constraints on corpus positions (i.e., tokens plus attribute values). I will show how such a query language can be efficiently implemented.
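As an illustration of the idea of regular expressions over token constraints, here is a minimal Python sketch. It is not Corpuscle's query syntax or implementation, and it uses naive backtracking rather than the efficient implementation the talk refers to. The names `attr`, `match_at`, `search` and the quantifier codes `"1"`, `"?"`, `"*"`, `"+"` are assumptions made up for this example.

```python
from typing import Callable, Dict, List, Optional, Tuple

Token = Dict[str, str]                  # attribute name -> value at one corpus position
Constraint = Callable[[Token], bool]    # one symbol of the query alphabet
Pattern = List[Tuple[Constraint, str]]  # (constraint, quantifier: "1", "?", "*" or "+")


def attr(name: str, value: str) -> Constraint:
    """Constraint that holds when a token's attribute equals the given value."""
    return lambda token: token.get(name) == value


def match_at(tokens: List[Token], pattern: Pattern, i: int, k: int = 0) -> Optional[int]:
    """Return the end index of a match of pattern[k:] starting at token i, else None."""
    if k == len(pattern):
        return i
    constraint, quant = pattern[k]
    lo = 0 if quant in ("?", "*") else 1
    hi = 1 if quant in ("1", "?") else len(tokens)
    # Count how many consecutive tokens from position i satisfy the constraint.
    n = 0
    while n < hi and i + n < len(tokens) and constraint(tokens[i + n]):
        n += 1
    # Greedy: try the longest run first, then back off.
    for take in range(n, lo - 1, -1):
        end = match_at(tokens, pattern, i + take, k + 1)
        if end is not None:
            return end
    return None


def search(tokens: List[Token], pattern: Pattern) -> List[Tuple[int, int]]:
    """Return (start, end) spans of non-empty matches, one per start position."""
    hits = []
    for start in range(len(tokens)):
        end = match_at(tokens, pattern, start)
        if end is not None and end > start:
            hits.append((start, end))
    return hits


# Toy corpus: each position carries a word form and a part-of-speech tag.
words = ["The", "big", "red", "dog", "chased", "a", "cat"]
tags = ["DET", "ADJ", "ADJ", "NOUN", "VERB", "DET", "NOUN"]
corpus = [{"word": w, "pos": p} for w, p in zip(words, tags)]

# Query: a determiner, any number of adjectives, then a noun.
noun_phrase = [
    (attr("pos", "DET"), "1"),
    (attr("pos", "ADJ"), "*"),
    (attr("pos", "NOUN"), "1"),
]

for start, end in search(corpus, noun_phrase):
    print(" ".join(token["word"] for token in corpus[start:end]))
```

On the toy corpus the query finds the spans "The big red dog" and "a cat". A production corpus engine would typically use indexes over attribute values instead of scanning every start position.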
In treebanks, the search domain is the sequence of analyzed sentences, which formally are directed graphs. Such structures can be queried using a calculus based on first-order predicate logic. The variables of the calculus are node variables, and a sentence (directed graph) matches a logical form if there is a set of graph nodes on which the form evaluates to true.
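The matching condition can be illustrated with a small Python sketch. It is not the INESS query language or its implementation: the class `SentenceGraph`, the functions `find_bindings` and `sentence_matches`, and the toy dependency labels are hypothetical, and the search is a brute-force enumeration of all assignments of nodes to variables.

```python
from itertools import product
from typing import Callable, Dict, Iterator, List, Set, Tuple

Binding = Dict[str, int]  # node variable name -> node id


class SentenceGraph:
    """One analyzed sentence: attributed nodes and labeled directed edges."""

    def __init__(self, nodes: Dict[int, Dict[str, str]], edges: Set[Tuple[int, str, int]]):
        self.nodes = nodes
        self.edges = edges

    def has_attr(self, node: int, name: str, value: str) -> bool:
        return self.nodes[node].get(name) == value

    def has_edge(self, src: int, label: str, dst: int) -> bool:
        return (src, label, dst) in self.edges


Formula = Callable[[SentenceGraph, Binding], bool]


def find_bindings(graph: SentenceGraph, variables: List[str], formula: Formula) -> Iterator[Binding]:
    """Yield every assignment of graph nodes to the variables that satisfies the formula."""
    for combo in product(graph.nodes, repeat=len(variables)):
        binding = dict(zip(variables, combo))
        if formula(graph, binding):
            yield binding


def sentence_matches(graph: SentenceGraph, variables: List[str], formula: Formula) -> bool:
    """A sentence matches if at least one satisfying assignment exists."""
    return next(find_bindings(graph, variables, formula), None) is not None


# Toy analysis of "dog chased cat" with made-up relation labels.
sentence = SentenceGraph(
    nodes={
        1: {"word": "dog", "pos": "NOUN"},
        2: {"word": "chased", "pos": "VERB"},
        3: {"word": "cat", "pos": "NOUN"},
    },
    edges={(2, "subj", 1), (2, "obj", 3)},
)


def verb_with_noun_subject(g: SentenceGraph, b: Binding) -> bool:
    """pos(x) = VERB and subj(x, y) and pos(y) = NOUN"""
    return (
        g.has_attr(b["x"], "pos", "VERB")
        and g.has_edge(b["x"], "subj", b["y"])
        and g.has_attr(b["y"], "pos", "NOUN")
    )


print(sentence_matches(sentence, ["x", "y"], verb_with_noun_subject))
print(list(find_bindings(sentence, ["x", "y"], verb_with_noun_subject)))
```

The variables are implicitly existentially quantified, so the sentence matches as soon as one binding satisfies the formula. Enumerating all bindings is exponential in the number of variables; a real treebank search would prune candidates using the constraints.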
To illustrate these two types of query languages, I will demonstrate the corpus tool Corpuscle and the treebanking infrastructure INESS, both hosted at the CLARINO Bergen Centre.
Short bio: Paul Meurer is a senior consultant and researcher at the University of Bergen Library and the Department of Linguistic, Literary and Aesthetic Studies. His research interests lie in the fields of theoretical and computational linguistics. Within theoretical linguistics, he has focused on morphology and syntax, and language typology. In computational linguistics, he has contributed to the research and development of language resources and tools in diverse fields, such as morphological and syntactic parsing for Norwegian, Georgian and Abkhaz, treebanking, visualization, corpus management and search, terminology, and metadata curation.
VilVite auditorium is in VilVite, Thormøhlensgt 51.
Conference room D is in VilVite, Thormøhlensgt 51.
Lille auditorium is in Datablokken, Høyteknologisenteret, Thormøhlensgt 55.
Stort auditorium is in Datablokken, Høyteknologisenteret, Thormøhlensgt 55.