Tool support for data validation by end-user programmers
Source
International Conference on Software Engineering
Proceedings of the 30th International Conference on Software Engineering
Leipzig, Germany
SESSION: Validation
Pages 867-870
Year of Publication: 2008
ISBN:978-1-60558-079-1
Authors
Christopher Scaffidi  Carnegie Mellon University, Pittsburgh, PA, USA
Brad Myers  Carnegie Mellon University, Pittsburgh, PA, USA
Mary Shaw  Carnegie Mellon University, Pittsburgh, PA, USA
Sponsors
ACM: Association for Computing Machinery
SIGSOFT: ACM Special Interest Group on Software Engineering
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1368088.1368226

ABSTRACT

End-user programming tools for creating spreadsheets and webforms offer no data types except "string" for storing many kinds of data, such as person names and street addresses. Consequently, these tools cannot automatically validate these data.

To address this problem, we have developed a new user-extensible model for string-like data. Each "tope" in this model is a user-defined abstraction that guides the interpretation of strings as a particular kind of data, such as a mailing address. Specifically, each tope implementation contains software functions for recognizing and reformatting that tope's kind of data.
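As a rough illustration of the idea, a tope can be pictured as a named bundle of two functions: one that recognizes a kind of data and one that reformats it. The sketch below is not the authors' implementation; the `Tope` class, its field names, and the US phone number example are all assumptions chosen for illustration.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class Tope:
    """Hypothetical stand-in for a tope: a recognizer plus a reformatter."""
    name: str
    recognize: Callable[[str], bool]   # does the string look like this kind of data?
    reformat: Callable[[str], str]     # rewrite the string into one canonical format


def _digits(text: str) -> str:
    # Drop every non-digit character (spaces, dashes, parentheses, dots).
    return re.sub(r"\D", "", text)


def _is_us_phone(text: str) -> bool:
    # Assumed rule for this sketch: exactly ten digits, any punctuation allowed.
    return len(_digits(text)) == 10


def _to_dashed(text: str) -> str:
    # Reformat to NNN-NNN-NNNN; callers should check recognize() first.
    d = _digits(text)
    return f"{d[:3]}-{d[3:6]}-{d[6:]}"


phone_tope = Tope("US phone number", _is_us_phone, _to_dashed)

if __name__ == "__main__":
    for raw in ["(412) 555-0123", "412.555.0123", "555-0123"]:
        if phone_tope.recognize(raw):
            print(raw, "->", phone_tope.reformat(raw))
        else:
            print(raw, "-> not recognized")
```

A field in a spreadsheet or webform would then be associated with `phone_tope`, so that different surface formats of the same value can be recognized and normalized.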

With our tools, end-user programmers define new topes and associate them with fields in spreadsheets, webforms, and other programs. This makes it possible at runtime to distinguish among invalid data, valid data, and questionable data that could be either valid or invalid. Once identified, questionable or invalid data can be double-checked and possibly corrected, thereby increasing the overall reliability of the data.
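The three-way distinction can be sketched by letting a recognizer return a graded score rather than a yes/no answer. The abstract does not specify how the tools compute this, so the scoring scale, the thresholds, and the ZIP code rules below are assumptions made only to illustrate the idea.

```python
import re
from enum import Enum
from typing import Dict, Iterable, List


class Verdict(Enum):
    VALID = "valid"
    QUESTIONABLE = "questionable"
    INVALID = "invalid"


def zip_code_score(text: str) -> float:
    """Hypothetical recognizer: 1.0 = clearly a ZIP code, 0.0 = clearly not."""
    s = text.strip()
    if re.fullmatch(r"\d{5}(-\d{4})?", s):
        return 1.0   # standard 5-digit or ZIP+4 form
    if re.fullmatch(r"\d{9}", s) or re.fullmatch(r"\d{5} \d{4}", s):
        return 0.5   # plausible but unusual form; worth a second look
    return 0.0


def classify(score: float) -> Verdict:
    # Assumed thresholds: only the extremes are treated as certain.
    if score >= 1.0:
        return Verdict.VALID
    if score <= 0.0:
        return Verdict.INVALID
    return Verdict.QUESTIONABLE


def triage(values: Iterable[str]) -> Dict[Verdict, List[str]]:
    """Group field values so questionable and invalid ones can be reviewed."""
    groups: Dict[Verdict, List[str]] = {v: [] for v in Verdict}
    for value in values:
        groups[classify(zip_code_score(value))].append(value)
    return groups


if __name__ == "__main__":
    column = ["15213", "15213-3890", "15213 3890", "Pittsburgh"]
    for verdict, items in triage(column).items():
        print(verdict.value, items)
```

In this sketch, values in the questionable group would be shown to the user for confirmation, while invalid ones would be flagged for correction.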


