Following up on my last 2 posts,
PowerShell : Can you do that less cryptic ?
about finding the missing elements in a sequence, inspired by a Hey, Scripting Guy! article : PowerShell : How can I tell which numbers are missing from a sequence of numbers found in a text file ? , and the sort problem mentioned there.
I came to another nice challenge: creating a test file for it that is in random order and has some numbers missing.
I made a remark about speed (edit 3) in the second post, as on IRC we were comparing other methods and the speed differences between them; you can find the examples at the end of the second post.
But with so few numbers (10) in the original file, it is hard to test.
So I made up some PowerShell commands to create a bigger test file, for example to test the speed of reading the file and of the sort. If you paste this code into the PowerShell console, it will create a test file of 1900 numbers from the range 1 to 2000, in random order and with 100 numbers missing.
Here are the results when I pasted this code into my console, with some remarks to explain what is going on :
PoSH># Write the test Numbers to the file
PoSH>
PoSH>1..2000 | sc c:\powershell\test.txt
PoSH>
PoSH># Read them back in an Array
PoSH>
PoSH>[collections.arraylist]$a=[io.file]::ReadAllLines('c:\powershell\test.txt')
PoSH>
PoSH># put them in random order and remove 100 to create the gaps
PoSH>
PoSH>$r = ($a.count - 100)..1 |% {$R = new-object random}{$R.next(0,$a.count) |%{$a[$_];$a.removeat($_)}}
PoSH>
This last line is the most interesting one, it is where the magic happens, but the start of this "magic" is already in the line before. Not in the way the file is read, but in the [collections.arraylist]$a , which casts the result to an ArrayList. In an ArrayList you can add and remove items, and we make use of that in the next line.
$r will take the results, as at the end of the command the $a ArrayList will only hold the 100 numbers that were not picked (or be empty if you pick them all). The next part is :
($a.count - 100)..1 this makes a range from 1900 down to 1, as the first line created a file with 2000 lines but we want 100 fewer. If you just want to randomize the list, leave out the -100 and all the numbers will stay; the list will only be randomized.
Next comes |% , the foreach (ForEach-Object), and then the first scriptblock, {$R = new-object random} , which creates a new random number generator. For more info see Thow Dices in MSH , and the 2 GUI games I made before that use it : MSH Minesweeper GUI game , MSH Concentration (scripting games part 4) . (PowerShell variable names are not case-sensitive, so $R and $r are in fact the same variable; this still works because the result is only assigned to $r after the whole pipeline has finished.)
As this scriptblock is followed by another one, foreach treats it as a Begin block and executes it only once; it then executes the next scriptblock as a Process block, once for every number in the range. Note that the value on the pipeline is not used here, it only determines the number of times to loop. The Process block starts with $R.next(0,$a.count) |%
This creates a random number from 0 to 1999 (the upper bound given to Next is not included), and that number is pipelined to the next scriptblock.
$a[$_];$a.removeat($_) takes the random number from the pipeline ($_) and outputs the item that is at that place in the ArrayList, $a[$_] ; next it removes the item it just output from the ArrayList with $a.removeat($_) , and that is where the base of the magic is hidden.
What, that's all, removing an item ? Yep ;-)
On the second pass the loop comes to $R.next(0,$a.count) again, but as we removed the item that we output, this will now be a number from 0 to 1998.
So we will never output an item twice, as we use the random number only as an index. The trick for leaving out 100 numbers is based on the same thing: the missing numbers are simply the last 100 that remain in $a AFTER we have picked the other 1900 at random and put them in the $r array.
Let's list 10 of them to show that this did work (note that $r[1..10] skips the very first item, $r[0]) :
PoSH>$r[1..10]
61
1343
1108
1527
762
1039
277
295
475
1155
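If the one-liner is a bit too cryptic, the same draw-and-remove idea can be written out as a function. This is only a sketch: the name Get-RandomSample and its parameters are made up for this example and are not part of PowerShell or of the code above.

```powershell
# Hypothetical helper (not from the original one-liner): returns $Count items
# picked at random from $InputList, never picking the same position twice.
function Get-RandomSample {
    param(
        [object[]]$InputList,
        [int]$Count = -1          # -1 means: pick all items (shuffle only)
    )

    if ($Count -lt 0) { $Count = $InputList.Count }
    if ($Count -gt $InputList.Count) { throw 'Count is larger than the list' }

    # Work on a copy in an ArrayList, so that items can be removed
    $pool = New-Object System.Collections.ArrayList
    $pool.AddRange($InputList)
    $random = New-Object System.Random

    for ($n = 0; $n -lt $Count; $n++) {
        # Next(min, max) never returns max, so this is always a valid index
        $index = $random.Next(0, $pool.Count)
        $pool[$index]             # output the picked item
        $pool.RemoveAt($index)    # remove it so it cannot be picked again
    }
}

# 1900 of the numbers 1..2000 in random order;
# the 100 numbers that stay behind in the pool are the gaps
$sample = Get-RandomSample -InputList (1..2000) -Count 1900
$sample.Count
```

Because every picked item is removed from the pool, the function needs no check for doubles, just like the original line.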
Now we can do our first test with the randomized list :
PoSH># Test the Times for sorting
PoSH>
PoSH>(measure-command {$r | sort {[int]$_}}).TotalMilliseconds
1553.9698
PoSH>(measure-command {[array]::sort($r)}).TotalMilliseconds
5.7518
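Hmm, we already see some difference. One thing to watch: [array]::sort() sorts $r in place, so $r is no longer in random order after this test; if you are following along, write the file before running the sort test, or rebuild $r first. For testing our solutions for the question we also want to test the reading of the file, so we write the new list back to the file :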
PoSH># Write the Randomized numbers back to the file
PoSH>
PoSH>$r | sc c:\powershell\test.txt
PoSH>
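Going back to the two sort timings for a moment: the two commands do not do quite the same work. sort {[int]$_} compares the lines as numbers, while [array]::sort() on a list of strings compares them as text (the sort problem from the earlier posts). A small sketch with made-up values shows the difference; casting to [int[]] first, as the complete solution further on does when it reads the file, gives the numeric order :

```powershell
# Made-up sample values, stored as strings like the lines read from the file
[string[]]$asText = '10', '9', '100', '1'
[array]::Sort($asText)            # sorts in place, comparing as text
"$asText"                         # 1 10 100 9

# The same values cast to integers before sorting
[int[]]$asNumbers = '10', '9', '100', '1'
[array]::Sort($asNumbers)         # sorts in place, comparing as numbers
"$asNumbers"                      # 1 9 10 100
```

So for a fair comparison of the two sort methods, both should be sorting numbers.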
Now that we are here, we have created our test file, and we can go on to test the complete solution against a random file of 1900 records with 100 numbers missing (I removed some of the output, but you can see that the missing numbers are random) :
PoSH>[int[]]$a=[io.file]::ReadAllLines('c:\powershell\test.txt')
PoSH>[array]::sort($a)
PoSH>$a |% {$i = 1}{while ($i -lt $_){$i;$i++};$i++}
6
23
29
61
73
129
207
234
241
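The gap-finding loop can also be wrapped in a function for reuse. A minimal sketch, assuming the numbers start at 1 as in the test file; Get-MissingNumber is a hypothetical name, and unlike the one-liner it resets the counter from each value, so a duplicate line in the file does not throw it off :

```powershell
# Hypothetical helper: outputs every number between 1 and the highest
# number in $Number that does not occur in $Number.
function Get-MissingNumber {
    param([int[]]$Number)

    # Sort a copy, so the caller's array is left as it was
    $sorted = $Number.Clone()
    [array]::Sort($sorted)

    $expected = 1
    foreach ($value in $sorted) {
        # Everything between the expected number and the value found is missing
        while ($expected -lt $value) {
            $expected
            $expected++
        }
        $expected = $value + 1
    }
}

Get-MissingNumber -Number 7, 2, 4, 1      # outputs 3, 5 and 6

# Against the test file it could be called like this :
# Get-MissingNumber -Number ([io.file]::ReadAllLines('c:\powershell\test.txt'))
```

Like the one-liner, this only finds gaps below the highest number in the list; if the highest numbers themselves are missing, they are not reported.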
So now we can go on and compare the speed of some different solutions for the question against a bigger file :
PoSH># Test the complete Command
PoSH>
PoSH>(Measure-Command {gc test.txt | sort {[int]$_} |% {$i = 1}{while ($i -lt $_){$i;$i++};$i++}}).TotalMilliseconds
1814.0741
PoSH>
PoSH>(Measure-Command {
>> [int[]]$a=[io.file]::ReadAllLines('c:\powershell\test.txt')
>> [array]::sort($a)
>> $a |% {$i = 1}{while ($i -lt $_){$i;$i++};$i++}
>> }).TotalMilliseconds
>>
307.5865
As you can see, there can be quite a difference here ;-)
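A single Measure-Command run can vary quite a bit from one run to the next, so for a fairer comparison it can help to repeat the test a few times and take the average. A small sketch of that; Measure-Average and its parameters are names I made up here, not an existing cmdlet :

```powershell
# Hypothetical helper: runs a scriptblock $Repeat times and
# returns the average time of a run in milliseconds.
function Measure-Average {
    param(
        [scriptblock]$ScriptBlock,
        [int]$Repeat = 5
    )

    $total = 0.0
    for ($n = 0; $n -lt $Repeat; $n++) {
        $total += (Measure-Command $ScriptBlock).TotalMilliseconds
    }
    $total / $Repeat
}

# Example: average time of a numeric sort over 3 runs
Measure-Average { 1..2000 | Sort-Object { [int]$_ } } -Repeat 3
```

The first run is often slower than the ones after it, so the average will still be a rough number, but it is less sensitive to one unlucky run.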
If you would like to do some more testing, also take a look at the PowerShell : Can you do that less cryptic ? post for some more ways to solve the question.
At the end of that post there are 2 examples using diff (Compare-Object) and 1 using for; you can compare them for speed as well.
Enjoy your testing and timing !
If you have any other interesting results (in speed or otherwise), other methods to handle the missing numbers problem, or another way to randomize a list, please leave them in the comments.
Greetings, /\/\o\/\/