
November 2nd, 2006

Avoid unnecessary copying. Perl loves to copy data

Avoid unnecessary copying. Perl loves to copy data around. When you write a subroutine, say sub bake { my ($temperature, $how_long) = @_; […] it’s convenient to think of its first line as a kind of prototype, but in fact it’s copying data. You may need to rethink this approach when passing huge variables to a subroutine.

Caveat: While it’s fine to get your data directly from @_, be aware that @_ is an alias for your original arguments; be sure you don’t modify any of them without documenting the action. Furthermore, named parameter passing works by copying @_ into a hash; you’ll have to come up with another way to recognize nonpositional arguments.

Sean M. Burke points out another application of this principle:

“Once I wrote a program to do some pretty horrific data conversion on an idiosyncratic markup language. Now, line-at-a-time processing was right out: the notation was SGML-like, i.e., freeform with its white space. Most or all of the manipulation was of the form $lexicon =~ s{(.*?)} {handle_subhead($1)}ieg; The problem was that $lexicon was quite large (250+ KB), and applying lots and lots and LOTS of such regex search and replace operations took forever! Why? Because all the modifications of a large scalar involved copying it each time. One day it occurred to me that no replacement operation involved crossing entry boundaries (i.e., record boundaries), so I changed the program to chop the lexicon into entries, and then applied the string replacements to each of those via a foreach. Since each entry was only ~2KB, there was MUCH less painful swapping, so the program ran about 50 times faster!”

11.3.3 Improving Disk Space Usage

Avoid temporary files. If your operating system supports pipes, use them. Even if your operating system doesn’t support them, Perl on your system might; try it.[5]

[5] As of version 5.6.0, fork in Perl works on Windows.
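
To make the copying advice above concrete, here is a minimal sketch of both ideas: reading arguments straight from @_ (including the aliasing caveat), and applying substitutions entry by entry rather than to one huge scalar. Everything specific in it is an assumption made for illustration: the "%%" entry separator, the [[...]] subhead markup, the helpers count_entries and trim_in_place, and the body of handle_subhead are invented here, not taken from the program described in the quote.

```perl
use strict;
use warnings;

# Hypothetical helper: works on $_[0] directly, so the (possibly huge)
# string is never copied into a lexical.
sub count_entries {
    my $separators = () = $_[0] =~ /^%%$/mg;   # count separator lines
    return $separators + 1;
}

# Hypothetical helper: @_ aliases the caller's variables, so this changes
# the caller's string. A side effect like this needs to be documented.
sub trim_in_place {
    $_[0] =~ s/\s+\z//;
    return;
}

# Stand-in for handle_subhead() from the quote; its real body is unknown.
sub handle_subhead {
    my ($text) = @_;    # copying a short captured string is cheap
    return uc $text;
}

# Assumed format: entries separated by a line containing only "%%",
# subheads written as [[...]].
my $lexicon = join "%%\n", map { "entry $_ [[subhead $_]] body\n" } 1 .. 3;

# Whole-string approach: every substitution works on the entire scalar.
(my $whole = $lexicon) =~ s{\[\[(.*?)\]\]}{handle_subhead($1)}ieg;

# Per-entry approach: split at entry boundaries, then substitute within
# each small entry. foreach aliases $entry to the array element, so the
# elements of @entries are modified in place.
my @entries = split /^%%\n/m, $lexicon;
for my $entry (@entries) {
    $entry =~ s{\[\[(.*?)\]\]}{handle_subhead($1)}ieg;
}
my $chunked = join "%%\n", @entries;

my $label = "oven   ";
trim_in_place($label);    # $label itself is changed

print "entries: ", count_entries($lexicon), "\n";
print $whole eq $chunked ? "same result\n" : "different result\n";
print "label: '$label'\n";
```

The per-entry version is only valid when no replacement can span an entry boundary, which is the condition the quote relies on. How much faster it runs will depend on the Perl version and the data, so it is worth timing both forms on real input.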

