text file is loaded using another text editor, such as Vim, the text is displayed as shown in
Figure 10-4. As you can see, Vim has loaded the text file without any formatting errors.
■Note Vim is available from http://vim.org. It is a vi-derived clone that can be used on
Windows systems.
Figure 10-4. Vim loads the text file in a nicely formatted display.
The more pressing problem lies in the structure of the data, which is illustrated in Figure 10-5.
Here, the data has new formatting, with extra columns, and the first column is not always in
the proper data format. To make matters worse, the badly formatted data contains repeated
information.
The challenge of the application is to read the stream and fix all of the problems. This requires
a thorough understanding of string processing and the different ways that text can be stored,
as discussed in Chapter 3. When you are processing data streams, you need to be aware of the
format of the data stream. In this example, we are processing ASCII text, and thus will be
manipulating bits according to the rules of the ASCII lookup table.
Whitespace characters are special characters in the text lookup table. They are associated
with numbers, but their representation is in the form of an action that the user can see. For
example, the character between single quotation marks (' ') is a space, the character \t is a tab,
and the character \n is a newline. The reason Notepad does not format the lottery text file nicely
(Figure 10-3) is the whitespace character used to indicate a newline. In Figure 10-6, the
highlighted buffer entry 0A is the hexadecimal character that indicates a linefeed, or newline, in
the lottery text file.
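The numbers behind these whitespace characters can be printed directly. The following is a minimal sketch, not part of the lottery application; the module name WhitespaceCodes is made up for illustration. It prints the hexadecimal code of the space, tab, linefeed, and carriage return characters.

```vb
Imports System

Module WhitespaceCodes
    Sub Main()
        ' AscW returns the numeric code of a character;
        ' "X2" formats that number as two hexadecimal digits.
        Console.WriteLine("space:           " & AscW(" "c).ToString("X2"))
        Console.WriteLine("tab:             " & AscW(ControlChars.Tab).ToString("X2"))
        Console.WriteLine("linefeed:        " & AscW(ControlChars.Lf).ToString("X2"))
        Console.WriteLine("carriage return: " & AscW(ControlChars.Cr).ToString("X2"))
    End Sub
End Module
```

The values printed are 20, 09, 0A, and 0D, which are the byte values you would look for in a hexadecimal view of a text file.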
CHAPTER 10 ■ LEARNING ABOUT PERSISTENCE
Figure 10-5. Structural problems of this data stream
Figure 10-6. Newline character used in lotto.txt
Figure 10-7 is a file created by Notepad. Notepad expects not a single whitespace character,
but two whitespace characters to indicate a newline: 0D (carriage return) and 0A (linefeed).
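One way to make a linefeed-only buffer readable in Notepad is to rewrite its newlines. The following is a minimal sketch under that assumption; the helper ToCrLf and the sample text are hypothetical and are not part of the chapter's application.

```vb
Imports System
Imports System.Text

Module NewlineSketch
    ' Hypothetical helper: converts lone linefeeds (0A) into the
    ' carriage return + linefeed pair (0D 0A).
    Function ToCrLf(ByVal text As String) As String
        ' Collapse existing CR LF pairs first so they are not doubled.
        Dim unified As String = text.Replace(vbCrLf, vbLf)
        Return unified.Replace(vbLf, vbCrLf)
    End Function

    Sub Main()
        ' Made-up sample data using linefeed-only newlines.
        Dim fixedText As String = ToCrLf("1 2 3" & vbLf & "4 5 6" & vbLf)
        ' Print each byte in hexadecimal to make the newline bytes visible.
        For Each b As Byte In Encoding.ASCII.GetBytes(fixedText)
            Console.Write(b.ToString("X2") & " ")
        Next
        Console.WriteLine()
    End Sub
End Module
```

In the printed bytes, each line now ends with 0D 0A instead of a lone 0A.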
Figure 10-7. Newline characters used by Notepad
Deciphering the Format
The echo has served its purpose of providing a way to develop an application in a top-down
manner. The next step is to remove the echo code and start writing the code that will fix the
data stream.
Fixing the data stream is not a trivial undertaking, because you are yet again faced with a
state problem. You don't want to fix one part of the stream, only to end up with a problem in
another part of the stream. Thus, you need to fix the stream incrementally and make sure at
each step that the fix has no unintended side effects.
The first step is to break the data stream into individual fields (each value in a column is a
field in this case). In Figure 10-5, the data stream had two parts, where the upper part seemed
to have a single space between the numbers and the lower part had the amount of space
necessary to align the numbers. The difference between the upper and lower parts is the whitespace
characters used. So, the first step will be to clean up the whitespace.
The following is the code that reads the buffer, splits it up, and reassembles the content
into a new buffer. The code is intermediate code that adds special bracket markers to indicate
what the text contains.
Imports System.IO
Imports System.Text

' TODO: Fix up this class
Public Class LottoTicketProcessor : Implements IProcessor
    Public Function Process(ByVal input As String) As String _
            Implements IProcessor.Process
        Dim reader As TextReader = New StringReader(input)
        Dim retval As New StringBuilder()
        ' Peek() returns -1 when there are no more characters to read
        Do While reader.Peek() <> -1
            ' Split the current line into fields on spaces and tabs
            Dim splitUpText As String() = _
                reader.ReadLine().Split(New Char() {" "c, ControlChars.Tab})
            Dim c1 As Integer
            For c1 = 0 To splitUpText.Length - 1
                ' Wrap each field in brackets so its boundaries are visible
                retval.Append("(" & splitUpText(c1) & ")")
            Next
            retval.Append(ControlChars.NewLine)
        Loop
        Return retval.ToString()
    End Function
End Class
In the implementation of Process(), the text will be parsed line by line. Then each line
is split into the individual fields. You could write the parsing routines yourself, but to parse a
buffer line by line, it is more efficient to use StringReader. The StringReader instance accepts the
string to parse and is then assigned to a variable of the TextReader base type.
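To see what the bracket markers reveal, consider the following minimal sketch. It applies the same split to a single line; the sample line is made up for illustration and is not taken from lotto.txt. It assumes a line in which runs of spaces are used to align the numbers.

```vb
Imports System
Imports System.Text

Module SplitSketch
    Sub Main()
        ' Hypothetical sample line: two spaces, then three spaces, between values.
        Dim line As String = "2000.05.20  5   9"
        Dim fields As String() = line.Split(New Char() {" "c, ControlChars.Tab})
        Dim marked As New StringBuilder()
        For Each field As String In fields
            ' Same bracket markers as in Process()
            marked.Append("(" & field & ")")
        Next
        Console.WriteLine(marked.ToString())
    End Sub
End Module
```

This prints (2000.05.20)()(5)()()(9). Every extra space in a run produces an empty field, which the brackets expose as (). That is the kind of whitespace problem the cleanup needs to address.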
As each line of text is parsed, the most efficient approach to building a buffer is to use
StringBuilder. You could keep appending data to the string, but if you do that too often, the
application's performance will suffer.
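The difference between the two approaches can be sketched as follows. This is an illustration only, with made-up variable names; it builds the same small buffer twice, once by appending to a string and once with StringBuilder.

```vb
Imports System
Imports System.Text

Module BufferSketch
    Sub Main()
        ' Appending to a String: every &= creates a new String object.
        Dim concatenated As String = ""
        For i As Integer = 1 To 5
            concatenated &= i.ToString() & " "
        Next

        ' StringBuilder: appends go into one growable internal buffer.
        Dim builder As New StringBuilder()
        For i As Integer = 1 To 5
            builder.Append(i).Append(" "c)
        Next

        ' Both approaches produce the same text.
        Console.WriteLine(concatenated = builder.ToString())
    End Sub
End Module
```

For five appends, the cost difference is negligible. It becomes noticeable when a buffer is built from many lines, as in Process().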
The String type is an immutable type, which means that once an object is initialized, you
cannot change the state of the object. The advantage of immutable types is that they can increase
the speed of your application, because code can assume that once an object has been assigned, it
will never change. The downside is that once an object is assigned, to modify the object state
even slightly, you must instantiate a new object, which would be the case if we used the = and