Rewriting Perl Code for Raku III: The Sorcerer

Last week, we started testing, learned how to create proper Raku classes, and covered the basics of functions. This time we’ll take a closer look at functions and arguments, and make some decisions about the API. And maybe while writing this I’ll argue myself out of a decision. It’s happened before.

One good thing about writing about a module is that you can slip into a certain mindset. For instance, right now I’m thinking a few paragraphs ahead, wondering how to explain why I changed the API from Perl 5 references to regular Raku types.

It’s at odds with some of the principles I laid down at the start, which state that I should make minimal changes to the API going from Perl to Raku. In Perl 5, you would create the “filesystem root” object like so:

my $root = OLE::Storage_Lite::PPS::Root->new(
  [ 0, 0, 0, 25, 1, 100 ],
  [ 0, 0, 0, 25, 1, 100 ],
  [ $workbook, $page_1, $sheet_1 ]
);

with a bunch of references to lists. By all rights, and the principles I set up earlier, the Raku equivalent should be almost exactly the same:

my $root = OLE::Storage_Lite::PPS::Root.new(
  [ 0, 0, 0, 25, 1, 100 ],
  [ 0, 0, 0, 25, 1, 100 ],
  [ $workbook, $page_1, $sheet_1 ]
);

In fact, all I did was copy and change two characters, specifically the Perl ‘->’ to the Raku ‘.’ operator. Clean, and very simple. And I think what I’ll do is actually just change the code back to using the Perl reference, at least in the API. Dereferencing it will be just a few lines, and I’ll have to change it in the tests as well, but I think the pain will be worthwhile.

This way I don’t have to field questions like “Why did you end up potentially breaking old code?” during talks. See, speaking at conferences about your code really can be a useful motivator!

I’d like a formal argument, please

So, I think I’ve settled on Perl-style formal references, at least for the current iteration. There are actually better ways to do this, but I’ll leave that for the proper Raku version. For right now, quick-n-dirty is the name of the game.
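Here’s a minimal sketch of what keeping the Perl-style references might look like on the Raku side. Root-Sketch and its attributes are names I made up for illustration, not the module’s actual class; the only point is that each bracketed array arrives as a single scalar, and .list unpacks it.

```raku
# Hypothetical stand-in for OLE::Storage_Lite::PPS::Root, for illustration only.
class Root-Sketch {
    has @.Time1st;
    has @.Time2nd;
    has @.Child;

    # Perl-style API: each argument is one scalar holding an array.
    method new( $raTime1st, $raTime2nd, $raChild ) {
        self.bless(
            Time1st => $raTime1st.list,   # "dereference" into a plain list
            Time2nd => $raTime2nd.list,
            Child   => $raChild.list,
        );
    }
}

my $root = Root-Sketch.new(
    [ 0, 0, 0, 25, 1, 100 ],
    [ 0, 0, 0, 25, 1, 100 ],
    [ 'workbook', 'page_1', 'sheet_1' ],
);

say $root.Time1st.elems;
say $root.Child.join(', ');
```

The caller’s code keeps the square brackets from the Perl version, and the unpacking stays inside the constructor where nobody else has to look at it.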

Moving on, we see an important method in the original Perl code, saving an object to disk.

sub save($$;$$) {
  my($oThis, $sFile, $bNoAs, $rhInfo) = @_;
  #0.Initial Setting for saving
  $rhInfo = {} unless($rhInfo);
  # ..

As I’ve mentioned before, OLE::Storage_Lite has been around for a long, long time, and it shows here: function prototypes (not signatures, which are a different kettle of fish) and the use of ‘$oThis’ instead of the more conventional ‘$self’.

Being prototypical

Prototypes were originally meant as a way to save you from having to write checks in your code. Theoretically, if your function was declared as sub save($$) and you tried to call it with save($fh) you would get an error, because the ‘$$’ means the subroutine takes two arguments, and you gave it just one.

But it also predated objects (yes, Virginia, objects in Perl haven’t been around all that long.) and they could have unforeseen side effects. So they were a fad for a while, but quickly faded out of existence.

These days they’re a reason for a more experienced Perl hacker to take the junior aside and explain quietly why we don’t use those anymore, and point them to some modern references, like Modern Perl (not an affiliate link, yet.)

Let’s at least partially convert that to Raku, like so:

method save($sFile, $bNoAs, $rhInfo) {
  #0.Initial Setting for saving
  $rhInfo = {} unless($rhInfo);
  # ..

The ‘$oThis’ means that this is a method call, so instead of writing sub save( $oThis, ... ) we can rewrite it to a method and gain ‘self’ instead of the arbitrary variable ‘$oThis’. Of course we do have to do a search-and-replace on ‘$oThis’ with ‘self’, but that’s relatively simple. More complex is what to do with the ‘;’ in the original prototype.
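One wrinkle with the partial conversion: Raku parameters are read-only by default, so assigning to $rhInfo inside the method needs the ‘is copy’ trait. A minimal sketch, with a made-up Saver-Sketch class and a made-up return string standing in for the real save():

```raku
# Hypothetical class for illustration; not the module's real save().
class Saver-Sketch {
    has $.Name = 'Root Entry';

    # 'is copy' gives the method its own writable copy of $rhInfo.
    method save( $sFile, $bNoAs, $rhInfo is copy ) {
        $rhInfo //= {};    # fall back to an empty hash when nothing useful was passed
        # 'self' replaces the hand-named $oThis invocant.
        return "{self.Name} -> $sFile, {$rhInfo.elems} setting(s)";
    }
}

say Saver-Sketch.new.save( 'test.xls', 0, Any );
```

I’ve used ‘//=’ (assign if undefined) here rather than the original ‘unless’; it is close to the Perl line but only kicks in for undefined values, not for an empty hash.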

Having options

It’s worth pointing out that OLE::Storage_Lite is taken at least in part from another (larger) module, OLE::Storage. This means that the internal code is redundant in a few places. Raku would let us rewrite what we have as:

method save($sFile, $bNoAs, $rhInfo = {}) {
  #0.Initial Setting for saving
  # ..

making $rhInfo an optional parameter with a default value. Now, this is a pretty common pattern for a recursive method, so I did a bit of digging. Namely I grep’ed for ‘save’ in the original (all-in-one) module, and found no recursive calls to it.

Debugging both sides now

This is also where the test suite I wrote earlier comes in handy, as it actually exercises the ‘save’ method. So I added a quick debugging message warn "Saving $rhInfo"; to my local copy of the code, and ran the test suite. Seeing just one ‘Saving …’ message in my test output convinced me it wasn’t recursive. So now the code just looks like:

method save($sFile, $bNoAs) {
  #0.Initial Setting for saving
  my %hInfo;
  # ..

Also, since $rhInfo is created in this method, there’s no reason to leave it as a reference. So the initial ‘r’ goes away, and we have left just ‘%hInfo’. It may get passed in to other methods, but Raku lets us pass hashes and arrays as ordinary variable types, so I’ll take advantage of that.

To be fair, leaving it as a reference would have saved me a bit of typing, but I’d already kind of decided that at least internally I’d try to use Raku types and calling conventions, and that left me with the choice of how to pass variables around.

Having options

Finally, there’s the question of what to do with the semicolon. Remember at the start, the function prototype was ‘($$;$$)’ which meant $oThis and $sFile were before the semicolon, and $bNoAs and $rhInfo were after. I can now reveal that ‘;’ in a Perl prototype means that whatever appears afterward is optional.

True to Raku’s nature, I can account for this in at least two ways. One way would be to decide that $bNoAs is always there and just has a default value, probably 0. That would look like method save( $sFile, $bNoAs = 0 ). But the documentation puts this argument in square brackets, indicating that it’s optional.

Raku has an alternate syntax to indicate that a parameter is optional, which looks like method save( $sFile, $bNoAs? ). I think this form is better than the default-value form because it states clearly that $bNoAs is optional. Both work; I just happen to like the ‘?’ modifier.
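To make the difference concrete, here is a small sketch of both spellings as plain subs. The sub names and the 'not passed' string are mine, chosen for illustration; the .arity and .count calls ask Raku how many arguments a routine requires and how many it accepts at most.

```raku
# Two hypothetical spellings of the same optional flag.
sub save-default(  $sFile, $bNoAs = 0 ) { $bNoAs }
sub save-optional( $sFile, $bNoAs?    ) { $bNoAs.defined ?? $bNoAs !! 'not passed' }

say save-default(  'test.xls' );
say save-optional( 'test.xls' );
say save-optional( 'test.xls', 1 );

# Signature introspection: required arguments, then the maximum accepted.
say &save-optional.arity;
say &save-optional.count;
```

With the default-value form the body can’t tell "caller passed 0" from "caller passed nothing"; with ‘?’ it can, because the missing argument shows up as undefined.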

Waiting for Huffman

Moving on, we have this wonderful line of code:

$rhInfo->{_BIG_BLOCK_SIZE}  = 2**
              (($rhInfo->{_BIG_BLOCK_SIZE})?
                  _adjust2($rhInfo->{_BIG_BLOCK_SIZE})  : 9);

When I was translating this initially, I was in something of a drone mindset, not truly thinking about what I was doing. I’d copied the $rhInfo variable into the method signature and just kept on writing. I ended up with a statement that I eventually shortened quite a bit.

$rhInfo.<_BIG_BLOCK_SIZE> = 2**
  ( $rhInfo.<_BIG_BLOCK_SIZE> ??
    _adjust2( $rhInfo.<_BIG_BLOCK_SIZE> ) !!
                                        9 );

The ‘.’ after $rhInfo indicates we’re dealing with a reference, and the <..> notation is now how barewords look inside hashes. The old {_BIG_BLOCK_SIZE} is still there, but it’s pronounced {‘_BIG_BLOCK_SIZE’}. A lot of people use the {‘..’} in Perl already so it’s not a big change, and it actually simplifies the backend enormously.

Also, at the start Larry and Damian pulled statistics on Perl code from CPAN and other repositories. They were looking for operator frequencies, among other things. Frequently used operators like qw() and -> got even shorter in Raku.

Others, like the ternary operator, weren’t so lucky. It got longer, and stretched to ‘?? .. !!’. So this is one place where the code will look a little funky. Maybe one day I’ll write a slang to fix it, but back to work.
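A quick sketch of the subscript forms and the stretched ternary side by side. The variable names follow the article; %hEmpty is an extra one I added to show the fallback branch.

```raku
my %hInfo  = _BIG_BLOCK_SIZE => 12;      # a plain hash
my $rhInfo = { _BIG_BLOCK_SIZE => 12 };  # a scalar holding a hash, Perl-reference style

# <...> quotes the key for you; {'...'} spells it out. Both reach the same slot.
say %hInfo<_BIG_BLOCK_SIZE> == %hInfo{'_BIG_BLOCK_SIZE'};
say $rhInfo.<_BIG_BLOCK_SIZE> == $rhInfo<_BIG_BLOCK_SIZE>;

# Perl's  cond ? a : b  becomes  cond ?? a !! b
my %hEmpty;
my $exponent = %hEmpty<_BIG_BLOCK_SIZE> ?? %hEmpty<_BIG_BLOCK_SIZE> !! 9;
say 2 ** $exponent;
```

The ‘.’ before the angle brackets is optional on a scalar that holds a hash, so the translated line could drop it without changing behavior.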

Trimming the verge

Earlier I mentioned that this module was trimmed down from a much larger full OLE reader/writer. This was the first place that became evident. Since $rhInfo is now called %hInfo and initialized inside the method, this statement deserves to be looked at a little closer.

my %hInfo;
%hInfo<_BIG_BLOCK_SIZE> = 2**
  ( %hInfo<_BIG_BLOCK_SIZE> ??
    _adjust2( %hInfo<_BIG_BLOCK_SIZE> ) !!
                                        9 );

After replacing $rhInfo with %hInfo this is what I got. But since %hInfo is defined just above, the test %hInfo<_BIG_BLOCK_SIZE> will never be true, so this entire block can be reduced to:

my %hInfo = _BIG_BLOCK_SIZE => 2**9;

While I’m here I’ll delete _adjust2(). No code pathway uses it, so out it goes. I’ll restore it if I have to, but right now I want the test scripts to pass, and that’s it. I’ve got the original source, and a map from Perl to Raku, and that’s all I need.

Culling yaks from the herd

Where there’s smoke there’s fire, so I stop what I’m doing and grep out every ‘sub X’ declaration in the source, putting the list in a scratch monkey. Then I go through the source (which I have below the new Raku source, deleting lines as I go) and look for methods that aren’t used, like _adjust2(). I delete each of these methods with extreme prejudice, because each line of code I don’t see is one I don’t have to translate.
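The same hunt can be roughed out in a few lines of Raku. This is a sketch under loud assumptions: the Perl fragment below is made up, and "a name that appears exactly once" is only a heuristic for "never called". It would wrongly flag public methods that are only called from outside the file.

```raku
# A made-up Perl fragment standing in for the real source file.
my $source = q:to/END/;
sub _adjust2 { return 1 }
sub _savePps { return 2 }
sub save     { _savePps() }
$oRoot->save('test.xls');
END

# Every name declared with 'sub NAME' at the start of a line.
my @declared = $source.match( / ^^ 'sub' \s+ (\w+) /, :g ).map( { .[0].Str } );

# Count every word-like token in the source.
my $count = $source.comb( /\w+/ ).Bag;

# A name seen only once appears in its own declaration and nowhere else.
my @unused = @declared.grep( { $count{$_} == 1 } );

say @unused;
```

It is no substitute for reading the code, but it gives a short list of candidates to check by hand before deleting anything.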

Checkpoint in git, and now it’s time for a lunch break. Afterwards, I’m getting into the save() method, and see what looks like a new yak to shave. Or a package to translate, to be precise.

  if(ref($sFile) eq 'SCALAR') {
    require IO::Scalar;
    my $oIo = new IO::Scalar $sFile, O_WRONLY;
    $rhInfo->{_FILEH_} = $oIo;
    # ...

In both Raku and Perl, you can create a single method called new( $sFile ) that treats $sFile as either a filename (scalar), file content (scalar reference) or file handle (scalar object.) In Perl, if we wanted to handle filenames, file contents, or file handles differently, we’d have to switch like this, or have different method names.

In Raku, we can handle this differently. In fact I can write the code to save() to a filename, and add save() to a filehandle later with no modifications needed. Above, I briefly touched on the fact that you can write more than one new() method, as long as the two method signatures were distinct.

multi method save( Str $filename ) {...}
multi method save( IO::Handle $fh ) {...}

Raku will let you write two methods called save(), as long as it can tell which one to call at runtime. So, I can call save( '/tmp/test.xlsx' ) or save( $out_filehandle ) on my object, and Raku will “dispatch” it to the right save() method automatically.

We call it ‘multiple dispatch’ for just that reason, dispatching a function call to multiple versions of a method. And this means that I can write the first save( Str $filename ) method without worrying about the other methods. I don’t have to add a new if-then branch to the existing code, or modify save() in any way.

I can just write my save() method and ignore the other IO:: types. Also, if someone gets my code later and wants to add a save() method that saves to something I know nothing about, they can write their new save() method without interfering with mine.
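A runnable sketch of the idea, with a made-up Doc-Sketch class whose methods only return a description of what they would do. The third candidate plays the part of the "someone later" adding their own save().

```raku
# Hypothetical class; the real save() bodies would write OLE data.
class Doc-Sketch {
    multi method save( Str $filename )  { "would open and write to $filename" }
    multi method save( IO::Handle $fh ) { "would write to an already-open handle" }

    # Added later, without touching the two candidates above.
    multi method save( IO::Path $path ) { "would write to the path $path" }
}

my $doc = Doc-Sketch.new;
say $doc.save( '/tmp/test.xls' );
say $doc.save( $*OUT );
say $doc.save( '/tmp/test.xls'.IO );
```

Each call picks its candidate from the type of the argument alone, so there is no if/elsif ladder on ref() anywhere.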

In this installment we’ve covered the basics of function and method calls, delved into the ternary operator, removed dead code and learned a little about multiple dispatch. Next time, we’ll open the binary filehandle we created above and delve into the mysteries of pack() and unpack().

I’ll also show you a new (yes, I couldn’t resist) grammar-based version of pack() that should cover the entire Perl gamut of packed types, with a bit of patience and a large enough test suite.

As always, gentle Reader, thank you for your time and attention. If you have any (constructive, please) comments, criticisms or questions, please let me know in the comment section below.

Rewriting Perl Code for Raku II: Electric Boogaloo

Picking up from Part One, we’d just finished up rewriting a Perl script into the test suite for the Raku translation of OLE::Storage_Lite. Raku programming is made easier by having lots of tools, but Microsoft documents aren’t yet well-represented in the Raku ecosystem.

Being able to read/write OLE allows us to create a whole range of Microsoft documents (at least where they’re documented.) Because of its day-to-day use, we’re focusing on Excel here. Many businesses still rely on Excel for their day-to-day task management, time tracking and home-grown processes.

I’ve been known to wax philosophical about this after a few Westmalle Tripels at various conferences. Now is the time for doing something about it. Here’s what our burgeoning test suite looked like, at least in part. The current code is in the raku-OLE-Storage_Lite repository. I’ve gotten rid of most of the Perl 5 test skeleton, but the essence remains.

use v6;
use Test;
use OLE::Storage_Lite;

plan 1;

my $oDt = OLE::Storage_Lite::PPS::Root.new(
  (),
  ( 0, 0, 16, 4, 10, 100 ), # 2000/11/4 16:00:00:0000
  ( $oWk, $oDir )
);
subtest 'Root', {
  isa-ok $oDt, 'OLE::Storage_Lite::PPS::Root';
  is $oDt.Name, 'Root Entry';
  is-deeply $oDt.Time2nd, [ 0, 0, 16, 4, 10, 100 ];
  # ...

Originally there really weren’t any Perl 5 tests for this module. I’m sure the original author treated the entire module as a black box, and they were happy to be able to run the sample writer in samples/, open the new test.xls in Excel, and when it actually read the file, treat that as ‘ok 1’, push it to CPAN and call it a day.

Testing, testing

That’s wonderful, and I may eventually adopt that methodology. For the moment, the lack of a test suite leaves me a bit unsatisfied. I suppose I could treat the entire module as a black box and fix the translated version line-by-line as I go through it. I’ll have to do that eventually (spoiler alert: That’s actually where I am – I’m writing these pieces a bit after the fact.)

That leaves me with the question of what to test, and what the quickest way to get there is. The individual Directory, Root and File objects are exposed to the user, and are part of the public API. So it makes some sense to create an object, look at the internals, and do my best to match that in Raku.

I Think I’m A Clone Now

There’s always two [implementations] of me standing around… I don’t want to get sidetracked by reading the entire OLE spec. I might start to realize what a huge job this really is, and abandon ship. So, I’m going to limit myself to the following:

Create a narrowly defined 1:1 clone of the exact source of OLE::Storage_Lite in Perl 5. The objects will act exactly like the Perl 5 version, as will the API. This way I don’t have to think about what the API should do, how it should look in Raku, how the objects get laid out, anything fancy. All I need to worry about is:

  1. When I write warn $oDt.raku, does the output look the same as use YAML; warn Dump($oDt); in Perl 5?
  2. When I write the final file to disk, does the Raku code output exactly the same file as the original Perl 5 version?

That’s it. It takes away a lot of possibilities, but it lets me focus on getting the job done, not how things should look. Being able to test how the individual objects look will tell me that the read API works and saves enough data to be able to reconstruct the object in memory.

Conversely, being able to match the binary output tells me that the write API works, so I’ve effectively tested as much as the original module did. Plus I can automate some of the process, especially on the read side.
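Both checks can be sketched in a few lines. Everything named here is an assumption for illustration: Root-Sketch stands in for the real object, the file names are throwaway, and the four bytes are arbitrary sample data rather than real output.

```raku
# Hypothetical stand-in object; the real one would be a PPS::Root.
class Root-Sketch {
    has $.Name    = 'Root Entry';
    has @.Time2nd = ( 0, 0, 16, 4, 10, 100 );
}

# Check 1: a text dump of the object, to eyeball against the Perl 5 YAML dump.
say Root-Sketch.new.raku;

# Check 2: are two files byte-for-byte identical?
sub same-bytes( IO::Path $got, IO::Path $expected --> Bool ) {
    $got.slurp( :bin ) eqv $expected.slurp( :bin )
}

# Demonstrate with two throwaway files in the temp directory.
my $file-a = $*TMPDIR.add( 'ole-sketch-a.bin' );
my $file-b = $*TMPDIR.add( 'ole-sketch-b.bin' );
$file-a.spurt( Blob.new( 1, 2, 3, 4 ) );
$file-b.spurt( Blob.new( 1, 2, 3, 4 ) );
say same-bytes( $file-a, $file-b );
unlink $file-a, $file-b;
```

In the real test suite the second file would be the one written by the Perl 5 sample script, kept around as the reference.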

Lost in Translation

You can check out the current source at raku-OLE-Storage_Lite, and follow along with some of the changes I’ve made. I also made sure to keep a working copy of the original OLE::Storage_Lite Perl 5 module around. My Raku tree right now is very close to Perl 5.

I can insert a debug statement like die "[$iBlockNo] [$sData]\n" in the Perl 5 code, go to the equivalent line in Raku, and expect that when I run the two test suites, they’ll die in exactly the same way.

This way when they don’t, I can immediately narrow down the problem simply by moving the ‘die’ statements up in the code until they return the same values. The line immediately below the ‘die’ statement will be the culprit.

The Nitty Gritty Perl Band

I’ll mention one thing in passing – the original Perl 5 source code is in a single file containing all of the packages. That’s not Raku style, so I’ve unpacked it into lib/OLE/Storage_Lite/* following the usual style of one Perl 5 class – one file.

So, time to get our hands dirty. The new Raku module won’t compile for quite a while, so we’d better put this into git. I’m also using App::Mi6 to do my development and eventual push to CPAN, so all of that boilerplate is there too.

So, cue the montage scene of the dedicated Raku hacker pounding away at the keyboard, with the occasional break for food and/or adult beverage. Looking over her shoulder, we see a familiar split-screen view, with Perl 5 code on top, and a new Raku file below.

use OLE::Storage_Lite::PPS;
package OLE::Storage_Lite::PPS::Root;
use vars qw($VERSION @ISA);
@ISA = qw(OLE::Storage_Lite::PPS);

use OLE::Storage_Lite::PPS;
unit class OLE::Storage_Lite::PPS::Root is OLE::Storage_Lite::PPS;

Raku has classes where Perl 5 has packages. The ‘unit’ declaration there says that the class declaration takes up the remainder of the file. This is sort of how Perl 5 does it, but gets rid of the ‘1;’ at the end of your package declaration.

It’s also useful for another reason I’m not going to show. Namely that the Perl 5 code is directly below the Raku code, commented out. I’m flipping between vim windows to delete lines as I translate them by hand. So the ‘unit class’ declaration helps in case I accidentally un-comment Perl 5 code – I’ll get big honkin’ warnings when I run the test suite.



Raku borrowed liberally from Perl 5’s Moose OO metamodel, to the point where using Raku will feel very similar. Just drop a few bits of syntactic sugar that Moose needed to work under Perl, and it’ll feel the same.

In this case the ‘is’ does the same job as in Moose, to introduce a parent class. Raku doesn’t need the sugar that Moose sweetens your code with, so you can just say your class ‘is’ a subclass of any other class.
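A tiny sketch of ‘is’ in action, using the braces form of class declaration so it fits in one file. PPS-Sketch and Root-Sketch are made-up names that mirror the real classes in shape only.

```raku
# Hypothetical parent and child, mirroring PPS and PPS::Root in shape only.
class PPS-Sketch {
    has $.Name;
    method describe { "PPS named {$.Name}" }
}

# 'is' names the parent class; nothing else is needed to inherit.
class Root-Sketch is PPS-Sketch { }

my $root = Root-Sketch.new( Name => 'Root Entry' );
say $root.describe;
say $root ~~ PPS-Sketch;
```

The child class gets the attribute, its accessor, the default constructor and describe() from the parent without a line of its own.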

Let’s keep rolling along here, with the next lines of the Perl 5 library:

require Exporter;
use strict;
use IO::File;
use IO::Handle;
use Fcntl;
use vars qw($VERSION @ISA);
@ISA = qw(OLE::Storage_Lite::PPS Exporter);
$VERSION = '0.19';
sub _savePpsSetPnt($$$);
sub _savePpsSetPnt2($$$);

use OLE::Storage_Lite::PPS;
unit class OLE::Storage_Lite::PPS::Root:ver<0.19> is OLE::Storage_Lite::PPS;

Moving along… Okay, ya caught me, ‘:ver<0.19>’ is something new that we should add. Versions are now integrated into classes, so you can check them and even instantiate based on version number.

The module actually doesn’t export anything, so we don’t need Exporter at all. Raku enables ‘strict’ automatically, has IO modules in core, and doesn’t need Fcntl. The forward declarations aren’t needed for Raku, so all that’s left is the module’s version number, which gets added to the class name. You can add other attributes, too.
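For the curious, a short sketch of how those name adverbs can be read back. Root-Sketch is a made-up class, and the ‘example’ author string is a placeholder of mine, not anything from the real module.

```raku
# Hypothetical class carrying version and author adverbs on its name.
class Root-Sketch:ver<0.19>:auth<example> { }

# The metaobject protocol exposes both values.
say Root-Sketch.^ver;
say Root-Sketch.^auth;
```

The version comes back as a Version object rather than a string, so it can be compared with other versions instead of being string-matched.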

Making things functional

To keep things simple for me writing the code, and me having to read the code weeks, months or years later, I want as close to a 1:1 relation between Perl 5 and Raku as I can. Another place where this requires an accommodation (but not much of one) is just a few lines down, writing the creation method ‘new’.

sub new ($;$$$) {
    my($sClass, $raTime1st, $raTime2nd, $raChild) = @_;
        # ...

By this point you’ll probably see more of why I say this module is a hard worker. It’s been around a long time, and function prototypes like this are one easy way to tell. Let’s rewrite it in a more modern Perl 5 style before making the jump to Raku, with function signatures.

sub new($sClass, $raTime1st, $raTime2nd, $raChild) {
        # ...

Just drop the old function prototype, and replace it with the variables we need to populate. Well, almost. If you know what a subroutine prototype is, you might think I’m pulling a fast one on you. And you’d be right. Look back at the original Perl 5 code, and you’ll see ‘($;$$$)’ is the prototype.

The ‘;’ separates required variables from optional variables, and we haven’t accounted for that in our Perl 5 code. Since I’m not here to modernize Perl 5 code but convert it to Raku, I’m going to ignore that in Perl 5 and go straight to Raku.

multi method new( @aTime1st?, @aTime2nd?, @aChild? ) {
  self.bless(
    Time1st => @aTime1st,
    Time2nd => @aTime2nd,
    Child => @aChild
  );
}

Under Construction

And there we are. Now, there’s quite a bit to take in, so I’ll take things slow. The first thing you’ll notice is the keyword ‘multi’. In Perl 5, you get to hand-roll your own constructors, so you can make them any way you like. In this case, the author chose to write new($raTime1st, $raTime2nd, $raChild), which is pretty common.

Raku gives me a default ‘new’ method, so I only need to hand-roll constructors when I want. Since I want to keep as close as reasonable to the original API, I’ll write a constructor that takes 3 arguments too. In my case I chose to simplify things just a bit here.
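Here is a self-contained sketch of a hand-rolled positional constructor living alongside the default named one. Root-Sketch is a stand-in class, and I’ve made all three positional arguments required in this sketch (unlike the optional ones in the module’s signature) so that a call with only named arguments still reaches the default constructor.

```raku
# Hypothetical stand-in class, for illustration only.
class Root-Sketch {
    has @.Time1st;
    has @.Time2nd;
    has @.Child;

    # Positional constructor: forwards to bless() with named arguments.
    multi method new( @aTime1st, @aTime2nd, @aChild ) {
        self.bless(
            Time1st => @aTime1st,
            Time2nd => @aTime2nd,
            Child   => @aChild,
        );
    }
}

# Hand-rolled, three positional lists:
my $positional = Root-Sketch.new( (), ( 0, 0, 16, 4, 10, 100 ), () );

# Default constructor, still available because 'new' is a multi:
my $named = Root-Sketch.new( Time2nd => ( 0, 0, 16, 4, 10, 100 ) );

say $positional.Time2nd;
say $positional.Time2nd eqv $named.Time2nd;
```

That is what ‘multi’ buys here: the new candidate is added next to the default constructor instead of replacing it.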

I’ve found over several years of writing Raku code that I rarely use references. In Perl 5 they were pretty much the only way to pass arrays or hashes into a function, because of its propensity to “flatten” arguments.

In Raku, you can still use the Perl 5 style, but formal argument lists are the way to go in my opinion. If you need to pass both an array and a hash to a Raku function, go for it. I encourage that in my tutorial courses, and recommend it to help break students out of their Perl 5 mindset.

This is not to say that there’s anything wrong with Perl 5’s argument lists. In fact, Perl 5 has taken some ideas from Raku for formal argument lists, and I encourage that. Cross-pollination of ideas should be encouraged; it’s how both languages grow and add new features.

Last week was about the overall module, this week we delved a bit into the OO workings. Next week we’ll talk about references, attributes, and maybe progressive typing.

Rewriting Perl Code for Raku

This time around we’re going to talk about how to rewrite Perl code in Raku. Even in 2019, a lot of the office world revolves around spreadsheets, whether they be Excel, LibreOffice or simple .csv files. Perl 5 has a plethora of modules to do this, a quick search for ‘Spreadsheet’ on MetaCPAN should convince you of that.

The Raku world doesn’t have quite as many modules as you’d expect, though. While it’s been around for a few years, “heavy lifting” modules like Spreadsheet stuff really haven’t come around yet. This involves packing and unpacking binary formats, and in Perl 5 this centered around the pack and unpack builtins, which are relative newcomers to Raku.

But Raku has built-in binary buffers, which take care of most of the need for pack/unpack. The main reason I can see for the gap is the OLE storage format. Basically it’s Microsoft’s way of packing a file system into a single data file. And at this point the proverbial yaks start to pile up, and reasonable people say “You know, Excel still accepts .csv files, I know how to build those.”

Enter raku-OLE-Storage_Lite – this is my translation-in-progress from Perl 5 to Raku. As of this writing it can read an entire OLE file (without data) and write a good portion of the sample file – I believe I’ve got maybe two methods left to debug.

Knee deep in yaks

CSV files are easy to write, but they come with their own set of troubles. When you import a .csv file into your Excel app (or LibreOffice, or whatever) you’re faced with a complex dialog asking you how to import your data. The average user doesn’t want that every time; they just want to open their spreadsheet.

So, it’s time to follow Liz’s lead and rewrite in Raku an existing module. First thing I did was go to Spreadsheet::ParseExcel and see how they did things. Within a few minutes I’d already encountered the first yak. After opening the file, it delegates it to OLE::Storage_Lite, which is much like James Brown, the “hardest-working man in show business”.

It’s still on version 0.19 at the time of writing, but I assure you that’s only because the current maintainer hasn’t updated the version to reflect reality. It may be legacy Perl rough-and-tumble code, but it’s been around for a long time. It wears its battle scars proudly.

It relies heavily on pack and unpack, which are still technically experimental in Raku. The OO and coding style betrays its pre-5.00 origins, and the tests are, well, very pragmatic. “Does it load? Great! Can it convert timestamps internally? Great! Ship it!”

To its credit, there’s a sample directory with a script you can use to view the contents of the internal filesystem of any OLE file, and a sample writer to create a known-working OLE file. That’ll do as a starting point.

Buckling down

So, reading an Excel spreadsheet means reading an OLE file system. And when I say file system, I’m not kidding. Inside your typical .xls file, there’s a small header and a root object. The root object contains “pointers” (really file offsets) to a document object, and inside that are file objects, each with pointers to the different blocks.

This is all intended to reflect the original disk layout, so it looks very much like an NTFS superblock and block layout. The documentation seems to have moved to a page detailing the OLE 1.0 and 2.0 formats; I’m not at all certain what the current version has.

How are Excel spreadsheets arranged in here? Worksheets are OLE directories, and inside each worksheet, tabs are individual files. How’s that for a bit of inspiration? Luckily the Root directory, Files and nested Directories are all separate objects, with at least a few common methods aggregated into a superclass.

Legacy Code

This is a long-winded way of saying the module in question is very much legacy code. And, as I want to bring it into the proverbial light, I’ve got to give some issues some thought.

  1. No useful tests, so I’ll have to write those.
  2. How much code do I want to sacrifice?
  3. How much can I save?

Well, I can put off #2 and #3 while writing some tests. Whoa, wait a minute. I don’t have a test file to work with, just some scripts over in sample/. Mumble, mumble, more yaks. Read README, find the script that will create one, run that.

Great, I’ve got a sample test.xls file. But given the amount of potential bit-rot it seems prudent to actually make sure that I’ve got a working Excel file before committing a few days (ha!) to getting a module working. Double-click it, launch into Excel’s cloud-serviced app, find that it’s one of those Win10 panes I’ve never figured out how to close, open task-killer, kill that.

Launch LibreOffice which I happen to have lying around – my current project at work is parsing a spreadsheet in Perl 5, which is what inspired this whole workload.

Yep, that parses; looks a bit odd because it’s coming up with a Japanese font, and some arbitrary English text, but it works. Also, looking at the code it generates all three object types – Root, File and Dir, so it’ll exercise the major code paths. Bonus.

Testing, testing

Now I’ve got the makings of a simple test file. The script builds objects individually, so I can run the individual calls, and check that the object’s internals look the way I want.

my $oDt = OLE::Storage_Lite::PPS::Root->new(
  [ ],
  [ 0, 0, 16, 4, 10, 100 ], # 2000/11/4 16:00:00:0000
  [ $oWk, $oDir ]
);

In Raku, this converts to:

my $oDt = OLE::Storage_Lite::PPS::Root.new(
  (),
  ( 0, 0, 16, 4, 10, 100 ), # 2000/11/4 16:00:00:0000
  ( $oWk, $oDir )
);

I’ve made one change already, to make things simpler for Raku users. In Perl, you have to pass lists as references unless you want to use the new function signatures. In Raku, you can just pass lists as you would ordinarily to your method call.

Using native data types rather than passing references around may seem a bit odd at first to new Raku programmers, but the new variable classes are easier to enforce strong typing on later, when you get used to the language.
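As a sketch of what "enforce typing later" could look like: the same list parameter, first unconstrained, then with a ‘where’ clause. The sub names and the "six integers" rule are my assumptions for illustration, based only on the six-element time lists in the examples.

```raku
# Today: any list is accepted.
sub loose-time( @aTime ) { @aTime.elems }

# Later: the same parameter, constrained to six integers.
sub strict-time( @aTime where { .elems == 6 && .all ~~ Int } ) { @aTime.elems }

say loose-time(  ( 0, 0, 16, 4 ) );
say strict-time( ( 0, 0, 16, 4, 10, 100 ) );

# A list that breaks the constraint is rejected when the call is made.
my $result = try strict-time( ( 0, 0, 16, 4 ) );
say $result.defined;
```

Callers don’t change at all between the two versions; only the signature gets stricter.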

Going with the flow

Now we’ve got something we can test, namely making sure that we’ve got a valid OLE Root document. So, before we go ahead with the code, I’ll share a few little things. I know very little about this code, so I want to make sure that I intimately copy each detail of the object at this stage. Later on I might get fancy and replace things with their own object types, but for now, my goal is going to be 1:1 replication.

I tend to like tmux as a shell environment, haven’t really gotten along with UIs. So, keeping in mind that I wanted an absolute 1:1 copy of the original object, I ended up doing this:

  1. Switch to new window, open my copy of ‘samples/’ in vim
  2. Add ‘use YAML; die Dump( $oDt )’ just below the line where it gets created
  3. Switch to new window, run the sample script, copy the YAML output
  4. Close the two new windows I created to keep clutter down
  5. Paste the YAML code into the new Raku test.

my $oDt = OLE::Storage_Lite::PPS::Root.new(
  (),
  ( 0, 0, 16, 4, 10, 100 ), # 2000/11/4 16:00:00:0000
  ( $oWk, $oDir )
);

  Name: "R\0o\0o\0t\0 \0E\0n\0t\0r\0y\0"
  No: ~
  Time2nd:
    - 0
    - 0
    - 16
    - 4
    - 10
    - 100
# and so on...

This should contain all I need to create an OLE file from this set of objects. I’m using this as a sneaky way of not reading the spec, at least not yet. As the old title goes: Algorithm + Data Structure = Program. Using YAML (or Data::Dumper) gives me the data structure, copying the Perl 5 code into Raku gives me the algorithm.

I should almost be able to keep line-for-line fidelity, so when a patch is posted to the Perl 5 source I can import it into Raku without too much trouble. But once I’ve got a better test base and a few users in Raku I’ll probably rewrite this whole module in a more Raku-ready fashion. I can keep the old module around for reference.

Encoding worries

But we’ve also got a surprise lurking here. “R\0o\0o\0t\0 \0E\0n\0t\0r\0y\0” looks like binary garbage, but is actually UCS-2, I think. If it is, then the OLE file is limited to a subset of Unicode. I can put restrictions on it later if I have to, but ATM I actually don’t care.

I’ve done enough time in the i18n salt mines that I know how to deal with this. Store the string in the best format possible (UTF-8 here) internally. When the time comes to write it to the network or disk, translate it to the final encoding.

This way I can see what all the attributes are at a glance without changing encoding. I can also manipulate everything using regular Raku code until the last moment. If I have to, I can use Raku’s gradual typing to constrain the string. More importantly, I don’t have to do any of this now.

Got any change?

This means I’m going to change things just a little bit more. When data gets added to ‘Name’ I’m going to assume it’s UTF-8. Since I’m not doing any I/O yet, I can make whatever assumptions I want. Keeping the internals simple keeps my life simple, at least.
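A sketch of the "plain string inside, encode at the edge" idea. Loud assumption: this is an ASCII-only illustration that just follows each character with a NUL to reproduce the dump above; to-wire and from-wire are hypothetical helpers, and a real implementation would use a proper 16-bit encoder.

```raku
# ASCII-only illustration: follow each character with a NUL, as in the dump.
# A real implementation would use a proper 16-bit encoder instead.
sub to-wire(   Str $name --> Str ) { $name.comb.map( * ~ "\0" ).join }
sub from-wire( Str $wire --> Str ) { $wire.subst( "\0", '', :g ) }

my $name = 'Root Entry';        # what the object stores internally
my $wire = to-wire( $name );    # what would go to disk

say $wire.chars;
say from-wire( $wire ) eq $name;
```

Everything between reading and writing gets to work with the readable ‘Root Entry’, and only these two helpers know about the on-disk form.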

So I’ll write out a quick is-deeply test and get on with things:

is-deeply $oDt, (
  Name => 'Root Entry',
  Time2nd => ( 0, 0, 16, 4, 10, 100 ),
  # ...
  Child => ( $oWk, $oDir )
);

This looks pretty straightforward, and almost how you’d write the original test in Perl 5. It won’t run yet, but that’s something we’ll tackle in the next part in the series.

I’m not done quite yet, because I’ve got a lot of these things to write, and not all of them may have the ‘Child’ attribute. I could write a tiny method that skipped over the ‘Child’ attribute along with anything else I wanted, but that felt clumsy. It looked like:

ok sorta-deeply $oDt, (
  Name => 'Root Entry',
  Time2nd => ( 0, 0, 16, 4, 10, 100 ),
  # ...
), ( 'Child' );

And notice that sorta-deeply is a function that does all the work, then passes a simple Bool back to the test. I’d end up writing all of the code that is-deeply does (except for the recursion), and get something back that’s less useful.
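For the record, a minimal sketch of what such a helper might have looked like. This is my reconstruction of the idea, not the code I actually wrote; Root-Sketch and its default values are stand-ins, and the helper only compares top-level attributes.

```raku
# Hypothetical stand-in for the object under test.
class Root-Sketch {
    has $.Name    = 'Root Entry';
    has @.Time2nd = ( 0, 0, 16, 4, 10, 100 );
    has @.Child   = ( 'workbook', 'dir' );
}

# Compare only the listed attributes, skipping any named in @skip.
sub sorta-deeply( $obj, @expected, *@skip --> Bool ) {
    for @expected -> $pair {
        my $name = $pair.key;
        next if $name (elem) @skip;
        # Call the accessor by name and compare structurally.
        return False unless $obj."$name"() eqv $pair.value;
    }
    return True;
}

my @want = Name => 'Root Entry', Time2nd => [ 0, 0, 16, 4, 10, 100 ], Child => [ ];
say sorta-deeply( Root-Sketch.new, @want, 'Child' );
```

The single Bool it hands back is exactly the weakness described above: when it fails, it can’t say which attribute was wrong.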

Next time we’ll get into making these tests pass. I’m writing the next section right after this, but you won’t get to see it for another week or so, I’m afraid. If you have questions or comments about the first part of this series, please feel free to comment below.