Abstract
Obtaining high-quality XML schemas that avoid or reduce application risks is an important practical problem, and several important aspects of it have yet to be addressed satisfactorily in existing work. In this paper, we propose FlashSchema, a tool for high-quality XML schema design that supports both one-pass and interactive schema design as well as schema recommendation. To the best of our knowledge, no other existing tool supports interactive schema design and schema recommendation. One salient feature of our work is the design of algorithms to infer k-occurrence interleaving regular expressions, which are not only more powerful in model capacity but also more efficient. These algorithms also form the basis of our interactive schema design. The other feature is that, starting from large-scale schema data that we have harvested from the Web, we devise a new solution for type inference and propose schema recommendation for schema design. Finally, we conduct a series of experiments on two XML datasets, comparing with 9 state-of-the-art algorithms and open-source tools in terms of running time, preciseness, and conciseness. Experimental results show that our work achieves the highest level of preciseness and conciseness within only a few seconds. Experimental results and examples also demonstrate the effectiveness of our type inference and schema recommendation methods.
| Original language | English |
|---|---|
| Title of host publication | Proceedings - 2020 IEEE 36th International Conference on Data Engineering, ICDE 2020 |
| Publisher | IEEE Computer Society |
| Pages | 1962-1965 |
| Number of pages | 4 |
| ISBN (Electronic) | 9781728129037 |
| Publication status | Published - Apr 2020 |
| Event | 36th IEEE International Conference on Data Engineering, ICDE 2020 - Dallas, United States. Duration: 20 Apr 2020 → 24 Apr 2020 |
Publication series
| Name | Proceedings - International Conference on Data Engineering |
|---|---|
| Volume | 2020-April |
| ISSN (Print) | 1084-4627 |
Conference
| Conference | 36th IEEE International Conference on Data Engineering, ICDE 2020 |
|---|---|
| Country/Territory | United States |
| City | Dallas |
| Period | 20/04/20 → 24/04/20 |
Bibliographical note
Publisher Copyright: © 2020 IEEE.
Fingerprint
Dive into the research topics of 'FlashSchema: Achieving high quality XML schemas with powerful inference algorithms and large-scale schema data'. Together they form a unique fingerprint.