Error-check a text file

 
 Error-check a text file - 5/31/2007 7:05:25 PM   
  markmcrobie

 

I have a function that, given a delimited text file with fields delimited by a single space:

1) Reads in a text file and splits it into an array on the carriage return

2) Converts the 1st space to a comma and then all other spaces to a hyphen

3) Writes the newly formatted entries back to the text file

This relies completely on the text file being formatted as per what I said in my first line above.

What I want to do is check the text file before doing stage 1) above, and proceed only if it's in the right format.  Presently I simply show the user a MsgBox telling them to make sure it's formatted correctly, but it doesn't actually check the text file.

The format should always be:

<number> <field1> <field2>, etc

<number> will always be a number, between 1 and (usually) 5 digits (it could be 6 eventually)
<fieldx> will always be alphanumeric characters.

For example:

22814 850 001 1A1
221 654 003 3B1R
14545 1 DX 146 004 4C1

I was hoping there might be a RegEx or similar that I could use to test each line. If it found any non-alphanumeric characters, or more than one space next to each other, it could halt the script and inform the user that there's a problem with the text file.
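
The three steps listed at the top of this post can be sketched roughly as follows. This is a minimal illustration, not the poster's actual function: FormatLine and ConvertFile are hypothetical names, and the sketch assumes the file is non-empty and uses CR/LF line endings.

```vbscript
Option Explicit

Const ForReading = 1
Const ForWriting = 2

' Step 2: first space becomes a comma, every later space becomes a hyphen.
' FormatLine is a hypothetical helper name.
Function FormatLine(ByVal strLine)
    Dim strResult
    ' Replace(expression, find, replacewith, start, count): count = 1 stops after the first space.
    strResult = Replace(strLine, " ", ",", 1, 1)
    ' Any spaces still left are converted to hyphens.
    FormatLine = Replace(strResult, " ", "-")
End Function

' Steps 1 and 3: read the file, convert each line, write it back.
' Assumes a non-empty file (ReadAll raises an error on an empty one) with CR/LF line endings.
Sub ConvertFile(ByVal strPath)
    Dim objFSO, objFile, arrLines, i
    Set objFSO = CreateObject("Scripting.FileSystemObject")

    Set objFile = objFSO.OpenTextFile(strPath, ForReading)
    arrLines = Split(objFile.ReadAll, vbCrLf)
    objFile.Close

    For i = 0 To UBound(arrLines)
        ' Skip blank lines such as the one after a trailing line break.
        If Len(arrLines(i)) > 0 Then arrLines(i) = FormatLine(arrLines(i))
    Next

    Set objFile = objFSO.OpenTextFile(strPath, ForWriting)
    objFile.Write Join(arrLines, vbCrLf)
    objFile.Close
End Sub

WScript.Echo FormatLine("22814 850 001 1A1")   ' 22814,850-001-1A1
```

ConvertFile is only defined here, not called; it would be called with the path of the real file once the format check has passed.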
 
 
Post #: 1
 
 RE: Error-check a text file - 6/1/2007 12:12:00 AM   
  SAPIENScripter


I think regular expressions are the only way you can do this, but probably only if there is always the same number of columns on every line of the file.  If row one has 6 columns and row two has 5 columns, I don't see how you could have any standardized method.

Assuming this is not an issue, you could try writing one long pattern that takes everything into account, including spaces (\s), but that might be overwhelming.  An easier approach might be to take each line and split it into an array on the space character, then use shorter regex patterns, anchored with ^ and $ so that each one must match the entire string.  You would know that the first block needs to match regexA, the second regexB, and so on.  This doesn't help with spaces, but maybe that's OK.  I might just create a new file using the Write method so that you end up with a known good file with single spaces.

One alternative might be to make a first pass on the file, line by line, using a broad regex pattern that looks for the right number of columns separated by a single space.
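
A rough sketch of the split-then-test idea, under some assumptions: IsValidLine is a hypothetical name, the two patterns are guesses at what "regexA" and "regexB" might look like for the format in the first post, and a line is assumed to need the number plus at least one field. One side effect worth noting is that Split on a single space turns a doubled space into an empty block, which an anchored pattern using + rejects.

```vbscript
Option Explicit

' Returns True when every space-separated block of strLine passes its own anchored pattern.
' The function name and both patterns are illustrative assumptions.
Function IsValidLine(ByVal strLine)
    Dim reFirst, reField, arrParts, i

    Set reFirst = New RegExp
    reFirst.Pattern = "^\d{1,6}$"        ' first block: 1 to 6 digits

    Set reField = New RegExp
    reField.Pattern = "^[0-9A-Za-z]+$"   ' later blocks: one or more alphanumerics

    IsValidLine = False
    arrParts = Split(strLine, " ")

    ' Assumption: a valid line has the number plus at least one field.
    If UBound(arrParts) < 1 Then Exit Function
    If Not reFirst.Test(arrParts(0)) Then Exit Function

    For i = 1 To UBound(arrParts)
        ' An empty block (from two spaces in a row) fails here.
        If Not reField.Test(arrParts(i)) Then Exit Function
    Next

    IsValidLine = True
End Function

WScript.Echo "single spaces: " & IsValidLine("221 654 003 3B1R")
WScript.Echo "doubled space: " & IsValidLine("221  654 003")
```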

Good luck with this.  You've got your hands full.

_____________________________

Jeffery Hicks
Windows PowerShell MVP
SAPIEN Technologies - Scripting, Simplified. www.SAPIEN.com

Follow Me: http://www.twitter.com/JeffHicks

(in reply to markmcrobie)
 
 
Post #: 2
 
 RE: Error-check a text file - 6/1/2007 12:33:47 AM   
  markmcrobie

 

The number of columns will be 4 about 90% of the time, but may sometimes be 6.

I thought I could do it line by line and maybe check each line for any non-alphanumeric characters (commas, dashes, etc).  If it finds any, there's a problem straight away.  If it doesn't find any, then I thought I could maybe check for spaces, and if found check and see if the next character is also a space.  If so, remove all but the first space.
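
The two checks described above could look something like this. It is only a sketch: the patterns are assumptions based on the format given in the first post, and the sample line is made up.

```vbscript
Option Explicit

Dim reBad, reSpaces, strLine

Set reBad = New RegExp
reBad.Pattern = "[^0-9A-Za-z ]"   ' any character that is not a letter, a digit or a space

Set reSpaces = New RegExp
reSpaces.Pattern = " {2,}"        ' a run of two or more spaces
reSpaces.Global = True            ' replace every run, not only the first one

strLine = "221  654   003 3B1R"   ' made-up sample with doubled and tripled spaces

If reBad.Test(strLine) Then
    WScript.Echo "Illegal character in: " & strLine
Else
    ' Collapse each run of spaces to a single space.
    WScript.Echo reSpaces.Replace(strLine, " ")   ' 221 654 003 3B1R
End If
```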

(in reply to SAPIENScripter)
 
 
Post #: 3
 
 RE: Error-check a text file - 6/1/2007 12:56:01 AM   
  dm_4ever


See how this works for your purpose...
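
The code attached to this post is not reproduced here. As a minimal sketch, and not the author's original, the pattern quoted later in this thread can be run against the sample lines from the thread like this (the variable names and the "OK"/"BAD" output are assumptions):

```vbscript
Option Explicit

Dim re, arrTests, strLine

Set re = New RegExp
' Pattern as quoted later in the thread. IgnoreCase is left at its default (False),
' so lower-case letters in a field would be rejected.
re.Pattern = "^\d{1,6}\s([0-9A-Z]+\s)*[0-9A-Z]+$"

' The first three lines come from the opening post; the last four are the
' lines reported later in the thread as not matching.
arrTests = Array("22814 850 001 1A1", _
                 "221 654 003 3B1R", _
                 "14545 1 DX 146 004 4C1", _
                 "ABC 123 DX 034", _
                 "1234567 392", _
                 "221 654 003 3$B1R", _
                 "1")

For Each strLine In arrTests
    If re.Test(strLine) Then
        WScript.Echo "OK:  " & strLine
    Else
        WScript.Echo "BAD: " & strLine
    End If
Next
```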


      

_____________________________

dm_4ever

My philosophy: K.I.S.S - Keep It Simple Stupid
Read Me: http://www.visualbasicscript.com/m_24727/tm.htm
Frequently Asked Stuff: http://www.visualbasicscript.com/m_47117/tm.htm

(in reply to markmcrobie)
 
 
Post #: 4
 
 RE: Error-check a text file - 6/1/2007 12:58:36 AM   
  ebgreen


You can do all of that with regular expressions. Pick one of the tests and we can work with you to figure out the pattern.

_____________________________

"... when you are good and crazy, oooh, oooh, oooh, the sky is the limit!" - The Tick
Good places to start: http://www.visualbasicscript.com/m_24727/tm.htm
http://www.visualbasicscript.com/m_47117/tm.htm

(in reply to markmcrobie)
 
 
Post #: 5
 
 RE: Error-check a text file - 6/3/2007 7:16:29 PM   
  markmcrobie

 

dm, I've spotted some flaws in your RegEx:

I modified your code slightly so that I got an Echo back for those strings in your array that DON'T match your RegEx, and it was these ones:

ABC 123 DX 034
1234567 392
221 654 003 3$B1R
1

Of those, I'm not sure why the first 2 don't match.

(in reply to ebgreen)
 
 
Post #: 6
 
 RE: Error-check a text file - 6/3/2007 7:18:39 PM   
  markmcrobie

 

Oops, just realised it's because the 1st "column" in those two lines isn't made up of digits only, 1-6 characters long.

(in reply to markmcrobie)
 
 
Post #: 7
 
 RE: Error-check a text file - 6/3/2007 7:24:05 PM   
  markmcrobie

 

dm, after some more testing your expression seems to work great.

Could you explain it to me, bit by bit - I'm desperate to learn more about RegExs.

^\d{1,6}\s([0-9A-Z]+\s)*[0-9A-Z]+$

I'm unsure what the first character does (^)

I guess \d{1,6} looks for digits only, 1 to 6 chars in length

I'm unsure of the next character (-)

I guess \s means space

Unsure of the (

I guess [0-9A-Z] looks for alphanumeric characters, and I guess the + means any continuous sequence of these

Unsure of the )*

Again, I guess [0-9A-Z] looks for alphanumeric characters, and I guess the + means any continuous sequence of these

Unsure of the $

< Message edited by markmcrobie -- 6/3/2007 7:29:27 PM >

(in reply to markmcrobie)
 
 
Post #: 8
 
 RE: Error-check a text file - 6/3/2007 8:57:56 PM   
  ehvbs

 

Hi markmcrobie,

I'm unsure what the first character does (^)

^ means: start of string; so "^a" will match "ab" but not "ba", while "a" will match both
In character class definitions, [^...] means: not (match characters not in the list)

I guess \d{1,6} looks for digits only, 1 to 6 chars in length

  yes: \d means digit; {min,max} means at least min and at most max repetitions

I'm unsure of the next character (-)

  outside a character class definition [], - just means a literal -

I guess \s means space

  yes/no: \s means whitespace (including tabs and line breaks)

Unsure of the (

  plain () are used to capture the contained part of the match for further use (\1.., submatches); at the same time they delimit the scope of operators like * or + or {min,max}

I guess [0-9A-Z] looks for alphanumeric characters, and i guess the + means any continuous sequence of these

  yes: a character class (set) definition matches any one character in the list at the given position (or, negated with [^...], any character not in the list)
* = zero or more, + = one or more

Unsure of the )*

  closing the scope/capture opened by (; the resulting part may occur zero or more times (*)

Again, I guess [0-9A-Z] looks for alphanumeric characters, and i guess the + means any continuous sequence of these

  yes

Unsure of the $

$ means "end of string" (cf ^: start of string)
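
A few of the points above can be tried out directly. The snippet below is a small sketch with made-up test strings. It also shows one practical consequence of \s: a pattern that uses \s accepts a tab between fields, so a literal space in the pattern may be the safer choice when only single spaces are allowed.

```vbscript
Option Explicit

Dim re, colMatches

Set re = New RegExp

' ^ anchors the match to the start of the string.
re.Pattern = "^a"
WScript.Echo "^a against ab: " & re.Test("ab")
WScript.Echo "^a against ba: " & re.Test("ba")

' \s matches any whitespace, so a tab-separated line passes...
re.Pattern = "^\d{1,6}\s[0-9A-Z]+$"
WScript.Echo "\s with a tab: " & re.Test("221" & vbTab & "654")

' ...while a literal space in the pattern only accepts a space.
re.Pattern = "^\d{1,6} [0-9A-Z]+$"
WScript.Echo "space with a tab: " & re.Test("221" & vbTab & "654")

' Plain () capture part of the match; it is available through SubMatches.
re.Pattern = "^(\d{1,6})\s"
Set colMatches = re.Execute("22814 850 001 1A1")
If colMatches.Count > 0 Then
    WScript.Echo "captured number: " & colMatches(0).SubMatches(0)   ' 22814
End If
```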


The VBScript Docs contain an "Introduction to RegExps" chapter; there is an interactive RegExps tester posted
by mikesok (?) in the "Post your Script" forum.

Have fun working with RegExps!

ehvbs

(in reply to markmcrobie)
 
 
Post #: 9
 
 
 
  
