Transforming web pages to become standard-compliant through reverse engineering

Benfeng Chen*, Vincent Y. Shen

*Corresponding author for this work

Research output: Chapter in Book/Conference Proceeding/Report › Conference Paper published in a book › peer-review

6 Citations (Scopus)

Abstract

Developing Web pages following established standards can make the information more accessible, their rendering more efficient, and their processing by computer applications easier. Unfortunately, more than 95% of the existing Web pages today are not "valid" in that they do not follow some of the recommendations (standards) of the World Wide Web Consortium (W3C). Fixing any Web page to make it standard-compliant is a major undertaking. There is now an open-source tool called HTML Tidy which will attempt to fix the invalid HTML code automatically. However, Tidy often changes the Web page's appearance after processing. It is not an effective tool to transform existing Web pages to make them standard-compliant. In this paper we report the design and implementation of PURE, a tool that cleans up an HTML document through reverse engineering. PURE starts with the rendering result of a given Web page and generates valid HTML code and CSS automatically to produce the same appearance. It is found to be effective for many existing Web pages. A prototype is now available for public testing and comments.

Original language: English
Title of host publication: ACM International Conference Proceeding Series - Proceedings of the 2006 International Cross-disciplinary Workshop on Web Accessibility, W4A - Building the Mobile Web
Subtitle of host publication: Rediscovering Accessibility
Pages: 14-22
Number of pages: 9
Publication status: Published - 2006
Event: 2006 International Cross-disciplinary Workshop on Web Accessibility, W4A - Building the Mobile Web: Rediscovering Accessibility - Edinburgh, United Kingdom
Duration: 22 May 2006 - 22 May 2006

Publication series

Name: ACM International Conference Proceeding Series
Volume: 134

Conference

Conference: 2006 International Cross-disciplinary Workshop on Web Accessibility, W4A - Building the Mobile Web: Rediscovering Accessibility
Country/Territory: United Kingdom
City: Edinburgh
Period: 22/05/06 - 22/05/06

Keywords

  • Browser
  • Cascade style sheets
  • HTML
  • HTML tidy
  • Rendering engine
  • W3C recommendations
  • Web page
