I'll be the first to admit that Perl is not my strong suit. But today I ran across this bit of code:

my $scaledWidth = int($width1x * $scalingFactor);
my $scaledHeight = int($height1x * $scalingFactor);
my $scaledSrc = $Media->prependStyleCodes($src, 'SX' . $scaledWidth);

# String concatenation makes this variable into a
# string, so we need to make it an integer again.
$scaledWidth = 0 + $scaledWidth;

I could be missing something obvious here, but I don't see anything in that code that could make $scaledWidth turn into a string. Unless somehow the concatenation in the third line causes Perl to permanently change the type of $scaledWidth. That seems ... wonky.

I searched a bit for "perl assignment side effects" and similar terms, and didn't come up with anything.

Can any of you Perl gurus tell me if that commented line of code actually does anything useful? Does using an integer variable in a concatenation expression really change the type of that variable?

Jim Mischel
  • I guess you never used a dynamically typed language? If that's the case, you'll be surprised by the magic it allows, for example when loading dynamic JSON data (and Perl even has proper auto-vivification) – ChatterOne Jan 27 '20 at 21:43
  • @ChatterOne I have used dynamically typed languages, but I'll admit that they're not something I've spent a lot of time with. My understanding was that the variable's type was fixed at assignment. If you wanted to change the type, you made a new assignment. This "magic" that can change the type depending on how it's used seems like an exceptionally bad idea if it can lead to ambiguity. Do other dynamically typed languages exhibit similar behavior? – Jim Mischel Jan 27 '20 at 21:47
  • @JimMischel This behavior could be more precisely referred to as "weak typing" - there are no types, either at compile or run time. (This of course is not the whole story, there *are* strong types such as the reference type which is not fluid, but string and number are not strong types.) – Grinnz Jan 27 '20 at 21:49
  • Not sure why this was marked as "Off Topic," but what the heck. I'd also like to know why it was downvoted. Because it's "Off Topic?" – Jim Mischel Jan 27 '20 at 21:50
  • @JimMischel the variable doesn't change type — its type is "scalar", and that can't change. There is no "string" type, nor "number", only things which can be asked to participate in stringy or numeric operations. Normally the storing of converted values that goes on under the hood for performance reasons has no visible effect (it's not *meant* to), but at the interface with something (like JSON) that insists that `42` and `"42"` are different values, something has to give. – hobbs Jan 27 '20 at 21:53
  • Perl is built around the concept of automatically casting values to the type needed. Addition needs a number? It will impose a scalar context on its operands, and it will cast the result of each operand to a number. `print` needs strings? It will cast any non-strings to strings then print them. – ikegami Jan 28 '20 at 02:15
  • @ikegami I understand the concept of automatic casting. But that's not the whole story, as the answers below show. – Jim Mischel Jan 28 '20 at 06:32

2 Answers

It is only a little bit useful.

Perl can store a scalar value as a number or a string or both, depending on what it needs.

use Devel::Peek;
Dump($x = 42);
Dump($x = "42");

Outputs:

SV = PVIV(0x139a808) at 0x178a0b8
  REFCNT = 1
  FLAGS = (IOK,pIOK)
  IV = 42
  PV = 0x178d9e0 "0"\0
  CUR = 1
  LEN = 16

SV = PVIV(0x139a808) at 0x178a0b8
  REFCNT = 1
  FLAGS = (POK,pPOK)
  IV = 42
  PV = 0x178d9e0 "42"\0
  CUR = 2
  LEN = 16

The IV and IOK tokens refer to how the value is stored as a number and whether the current integer representation is valid, while PV and POK indicate the string representation and whether it is valid. Using a numeric scalar in a string context can change the internal representation.

use Devel::Peek;
$x = 42;
Dump($x);
$y = "X" . $x;
Dump($x);

SV = IV(0x17969d0) at 0x17969e0
  REFCNT = 1
  FLAGS = (IOK,pIOK)
  IV = 42

SV = PVIV(0x139aaa8) at 0x17969e0
  REFCNT = 1
  FLAGS = (IOK,POK,pIOK,pPOK)
  IV = 42
  PV = 0x162fc00 "42"\0
  CUR = 2
  LEN = 16

Perl will seamlessly convert one to the other as needed, and there is rarely a need for the Perl programmer to worry about the internal representation.

I say rarely because there are some known situations where the internal representation matters.
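
One such situation, offered here only as an illustration and not necessarily the case the answer had in mind, is the classic bitwise operators. Without the `bitwise` feature enabled, `|` does a numeric OR when an operand is stored as a number, and a character-by-character OR when both operands are stored as strings. A minimal sketch:

```perl
use strict;
use warnings;

# Classic (pre-"bitwise" feature) semantics: the operator looks at how
# each operand is stored, not just at what it looks like.
my $as_numbers = 42 | 1;        # numeric OR: 0b101010 | 0b000001
my $as_strings = "42" | "1";    # OR on the characters: "4"|"1" is "5", "2" is kept

print "numbers: $as_numbers\n";   # numbers: 43
print "strings: $as_strings\n";   # strings: 52
```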

mob

Perl variables are not typed. Any scalar can be either a number or a string depending on how you use it. There are a few exceptions where an operation is dependent on whether a value seems more like a number or string, but most of them have been either deprecated or considered bad ideas. The big exception is when these values must be serialized to a format that explicitly stores numbers and strings differently (commonly JSON), so you need to know which it is "supposed" to be.
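
To make the JSON case concrete, here is a small sketch using the core JSON::PP module. It deliberately sticks to the two unambiguous cases, a value that has only ever been a number and one that has only ever been a string, because the output for mixed-use values depends on the encoder and its version.

```perl
use strict;
use warnings;
use JSON::PP;   # core module; exports encode_json by default

# One value written as a number, one written as a string.
my $json = encode_json([ 42, "42" ]);

print "$json\n";   # [42,"42"]
```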

The internal details are that an SV (scalar value) contains any of the values that have been relevant to its usage during its lifetime. So your $scaledWidth first contains only an IV (integer value) as the result of the int function. When it is concatenated, that uses it as a string, so it generates a PV (pointer value, used for strings). That variable now contains both; it is not one type or the other. So when something like a JSON encoder needs to determine whether it's supposed to be a number or a string, it sees both in the internal state.
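
The same thing can be observed from Perl code with the core B module instead of reading Dump output. This is only a sketch: `slots` is a made-up helper name, and it checks the private flags (`SVp_IOK`, `SVp_NOK`, `SVp_POK`) because, as far as I know, Perl 5.36 and later cache the string form without setting the public POK flag.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: list which slots of a scalar currently hold a value.
# It reads $_[0] directly so it inspects the caller's variable, not a copy.
sub slots {
    my $flags = B::svref_2object(\$_[0])->FLAGS;
    my @slots;
    push @slots, 'IV' if $flags & B::SVp_IOK();
    push @slots, 'NV' if $flags & B::SVp_NOK();
    push @slots, 'PV' if $flags & B::SVp_POK();
    return join ',', @slots;
}

my ($width1x, $scalingFactor) = (100, 1.5);
my $scaledWidth = int($width1x * $scalingFactor);
my $before = slots($scaledWidth);     # only the integer slot is filled

my $code  = 'SX' . $scaledWidth;      # use it as a string
my $after = slots($scaledWidth);      # the string slot is filled in as well

print "before: $before\n";   # before: IV
print "after:  $after\n";    # after:  IV,PV
```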

There have been three strategies that JSON encoders have taken to resolve this situation. Originally, JSON::PP and JSON::XS would simply consider it a string if it contains a PV, or in other words, if it's ever been used as a string; and as a number if it only has an IV or NV (double). As you alluded to, this leads to an inordinate amount of false positives.
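
As a rough sketch of that first heuristic (not the actual JSON::PP or JSON::XS source; `naive_json_type` is a made-up name), the decision boils down to "does it have a PV?":

```perl
use strict;
use warnings;
use B ();

# Hypothetical sketch of the original heuristic: anything with a string
# slot is a string, even if it started life as a number.
sub naive_json_type {
    my $flags = B::svref_2object(\$_[0])->FLAGS;
    return 'string' if $flags & B::SVp_POK();
    return 'number' if $flags & (B::SVp_IOK() | B::SVp_NOK());
    return 'string';
}

my $width  = 42;
my $first  = naive_json_type($width);   # number
my $label  = "SX$width";                # stringifies $width as a side effect
my $second = naive_json_type($width);   # string

print "$first, then $second\n";   # number, then string
```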

Cpanel::JSON::XS, a fork of JSON::XS that fixes a large number of issues, along with more recent versions of JSON::PP, use a different heuristic. Essentially, a value will still be considered a number if it has a PV but the PV matches the IV or NV it contains. This, of course, still results in false positives (example: you have the string '5', and use it in a numerical operation), but in practice it is much more often what you want.
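
A sketch in the spirit of that second heuristic follows. It is an approximation for illustration, not the code either module actually uses, and `refined_json_type` is a made-up name. It shows both the improvement and the remaining false positive.

```perl
use strict;
use warnings;
use B ();

# Hypothetical sketch: a value with both slots filled counts as a number
# only if its string form is exactly what the number stringifies to.
sub refined_json_type {
    my $flags = B::svref_2object(\$_[0])->FLAGS;
    return 'string' unless $flags & (B::SVp_IOK() | B::SVp_NOK());
    return 'number' unless $flags & B::SVp_POK();
    no warnings 'numeric';
    my $round_trips = (0 + $_[0]) eq $_[0];
    return $round_trips ? 'number' : 'string';
}

my $width = 42;
my $label = "SX$width";   # a number later used as a string
my $digit = '5';
my $sum   = $digit + 1;   # a string later used as a number
my $name  = 'SX42';       # a string that was never used as a number

my @types = map { refined_json_type($_) } $width, $digit, $name;
print "@types\n";   # number number string
```

The middle result is the false positive described above: `$digit` was written as the string '5', but it is reported as a number because it was used in an addition.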

The third strategy is the most useful if you need to be sure what types you have: be explicit. You can do this by reassigning every value to explicitly be a number or string as in the code you found. This assigns a new SV to $scaledWidth that contains only an IV (the result of the addition operation), so there is no ambiguity. Another method of being explicit is using an encoding method that allows specifying the types you want, like Cpanel::JSON::XS::Type.
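
A minimal sketch of the reassignment approach, assuming you want small helpers for it (`as_number`, `as_string` and `slots` are made-up names, not part of any module):

```perl
use strict;
use warnings;
use B ();

# Hypothetical helpers: each returns a fresh value with a single slot filled.
sub as_number { return 0 + $_[0] }
sub as_string { return "$_[0]" }

# Hypothetical helper: list which slots of the caller's scalar hold a value.
sub slots {
    my $flags = B::svref_2object(\$_[0])->FLAGS;
    my @slots;
    push @slots, 'IV' if $flags & B::SVp_IOK();
    push @slots, 'NV' if $flags & B::SVp_NOK();
    push @slots, 'PV' if $flags & B::SVp_POK();
    return join ',', @slots;
}

my $scaledWidth = 42;
my $src   = 'SX' . $scaledWidth;          # fills in the string slot
my $mixed = slots($scaledWidth);          # IV,PV

$scaledWidth = as_number($scaledWidth);   # same idea as 0 + $scaledWidth
my $numeric = slots($scaledWidth);        # IV

my $text    = as_string($scaledWidth);    # a separate, string-only value
my $stringy = slots($text);               # PV

print "$mixed | $numeric | $stringy\n";   # IV,PV | IV | PV
```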

The details of course vary if you're not talking about the JSON format, but that is where this issue has been most deliberated. This distinction is invisible in most Perl code, where the operation, not the value, determines the type.

Grinnz
  • So whether that assignment statement in my code example is relevant will depend on the context below? Thanks. Since it's been in the code for quite some time, and the code does work, I'm going to leave it alone for now. – Jim Mischel Jan 27 '20 at 21:52
  • @JimMischel Correct, it is just a rearrangement of the internals of the value, what effect that has depends on whether anything looks at the internals of it afterward. – Grinnz Jan 27 '20 at 21:53