Find

Find (⍷) is a dyadic primitive function which tests if the left argument appears as a contiguous subarray of the right argument.

Examples

Both arguments can be arrays of any shape. The entire left argument is tested against each position in the right argument. The result is a boolean array having the same shape as the right argument, where a 1 indicates the position of the first element of the matched subarray (which can be seen as the "leftmost" or "top left" position in case of a vector or matrix). If the left argument has lower rank, it is treated as if the shape is prepended with ones. If the left argument has higher rank, Find does not error, but it is never found in the right argument (resulting in an all-zero array).

      'ANA'⍷'BANANA'  ⍝ Matches may overlap
0 1 0 1 0 0
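
Because the result is a boolean mask, it combines naturally with reductions. The following is a minimal sketch of some common follow-up expressions; the last line assumes ⎕IO←1 and a dialect that provides monadic ⍸ (Indices), such as Dyalog APL.

```apl
      +/'ANA'⍷'BANANA'     ⍝ number of occurrences, overlapping ones included
2
      ∨/'ANA'⍷'BANANA'     ⍝ does the pattern occur at all?
1
      ⍸'ANA'⍷'BANANA'      ⍝ starting positions (assumes ⎕IO←1 and monadic ⍸)
2 4
```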

      WEEK
SUNDAY
MONDAY
TUESDAY
WEDNESDAY
THURSDAY
FRIDAY
SATURDAY
      'DAY'⍷WEEK  ⍝ Find the pattern 'DAY' in WEEK; right arg may have higher rank
0 0 0 1 0 0 0 0 0
0 0 0 1 0 0 0 0 0
0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 1 0 0
0 0 0 0 0 1 0 0 0
0 0 0 1 0 0 0 0 0
0 0 0 0 0 1 0 0 0
      WEEK⍷'DAY'  ⍝ WEEK not found in 'DAY'; left arg may have higher rank but it is never found
0 0 0
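
The "top left" rule for matrices can be illustrated with a small numeric case. The matrix M below is made up for this illustration; the last expression sketches the rank extension described above, where a vector pattern behaves like a one-row matrix.

```apl
      M←3 4⍴0 1 1 0 0 1 1 1 0 0 1 1
      M
0 1 1 0
0 1 1 1
0 0 1 1
      (2 2⍴1)⍷M            ⍝ a 1 marks the top left corner of each 2×2 block of ones
0 1 0 0
0 0 1 0
0 0 0 0
      (1 1⍷M)≡(1 2⍴1 1)⍷M  ⍝ a vector pattern acts like a one-row matrix
1
```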

For nested arrays, Find tests for exact match between the elements.

      'BIRDS' 'NEST'⍷'BIRDS' 'NEST' 'SOUP'
1 0 0
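
A few variations on the same data may help to show what "exact match" means here. This is only a sketch; note in particular that an unenclosed character vector is treated as a pattern of characters, not as a single nested element.

```apl
      'BIRD' 'NEST'⍷'BIRDS' 'NEST' 'SOUP'   ⍝ 'BIRD' does not match 'BIRDS'
0 0 0
      (⊂'NEST')⍷'BIRDS' 'NEST' 'SOUP'       ⍝ enclosed: look for the single element 'NEST'
0 1 0
      'NEST'⍷'BIRDS' 'NEST' 'SOUP'          ⍝ unenclosed: four characters, longer than the right argument
0 0 0
```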

Model

Find can be modelled as follows, where all possible subarrays of the right argument are checked to see if they match the left argument:[1]

ebar←{⎕IO←0
 r←(≢⍴⍺)⌈≢⍴⍵                    ⍝ maximum rank
 r>≢⍴⍺:(⍺⍴⍨(⍴⍺),⍨(r-≢⍴⍺)⍴1)∇ ⍵  ⍝ if ⍺ has lesser  rank, make it the same rank
 (⍴⍺)∨.>r↑(⍴⍵),¯1:(⍴⍵)⍴0        ⍝ return 0s if ⍺ has greater rank or is longer
 ww←⍵
 (⍴⍵) ↑ ⍺∘{⍺≡(⍴⍺)↑⍵↓ww}¨ ⍳(×⍴⍺)+(⍴⍵)-⍴⍺
}
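
As a quick check, the model can be compared with the primitive on non-empty arguments. The session below assumes ebar has been defined as above in a dialect supporting dfns, such as Dyalog APL; the numeric matrix is made up for illustration.

```apl
      'ANA' ebar 'BANANA'
0 1 0 1 0 0
      ('ANA' ebar 'BANANA')≡'ANA'⍷'BANANA'
1
      (2 2⍴1) ebar 3 4⍴0 1 1 0 0 1 1 1 0 0 1 1
0 1 0 0
0 0 1 0
0 0 0 0
```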

Empty left argument

Implementations differ in their treatment of empty left arguments:

  • APL2, GNU APL, NARS2000, and Dyalog APL indicate positions where the left argument can fit, even if the prototypes don't match.
  • APLX never finds any empty arrays.
  • APL+ finds empty arrays everywhere, even where they would extend beyond the edges of the right argument.
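
For a dialect in the first group, the description above suggests the following results. This is a sketch that assumes Dyalog APL; APLX and APL+ would be expected to answer differently, as listed.

```apl
      ''⍷'ABC'    ⍝ empty character vector: fits at every position
1 1 1
      ⍬⍷'ABC'     ⍝ empty numeric vector: prototypes differ, but it still fits everywhere
1 1 1
```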

Discussion

In 2021, the correctness of the primitive's implementation for empty left arguments was discussed internally at Dyalog Ltd.[2]

In February, Roger Hui argued that the primitive had a bug: it found empty subarrays of the wrong type, even though Find is defined in terms of Match (≡), which does distinguish between empty arrays of unequal type. Adám Brudzewsky devised an alternative mental model to describe Find's behaviour: rather than checking whether the left argument could be extracted from the right argument by peeling off outer elements, one checks whether the left argument could be overlaid on the right argument without the right argument changing. He wrote two almost identical models to emphasise the difference between the extraction model and the overlay model:

ee←{ ⍝ extraction model
    ⎕IO←0
    ra←≢sa←⍴⍺ ⋄ rw←≢sw←⍴⍵
    rm←ra⌈rw
    rm>ra:⍵ ∇⍨⍺⍴⍨sa,⍨1⍴⍨rm-ra
    sa∨.>rm↑sw,¯1:sw⍴0
    _Extract_←{ ⍝ does extracting ⍺⍺ from ⍵⍵ change ⍺⍺?
        ⍺⍺≡⍺↑⍵↓⍵⍵
    }
    sw↑sa∘(⍺ _Extract_ ⍵)¨(-⍨∘×⍨sa)↓⍳sw
}
eo←{ ⍝ overlay model
    ⎕IO←0
    ra←≢sa←⍴⍺ ⋄ rw←≢sw←⍴⍵
    rm←ra⌈rw
    rm>ra:⍵ ∇⍨⍺⍴⍨sa,⍨1⍴⍨rm-ra
    sa∨.>rm↑sw,¯1:sw⍴0
    _Overlay_←{ ⍝ does overlaying ⍺⍺ on ⍵⍵ change ⍵⍵?
        ⍵⍵≡⍺⍺@((⍳⍺)+⊂⍵)⊢⍵⍵
    }
    sw↑sa∘(⍺ _Overlay_ ⍵)¨(-⍨∘×⍨sa)↓⍳sw
}
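
The two models agree on non-empty patterns and are expected to differ only when the left argument is empty and of a different type. The session below is a sketch assuming ee and eo are defined as above in Dyalog APL 16.0 or later (eo uses the At operator @).

```apl
      'ANA' ee 'BANANA'
0 1 0 1 0 0
      'ANA' eo 'BANANA'
0 1 0 1 0 0
      ⍬ ee 'ABC'      ⍝ extraction: ⍬ never matches the empty character vector taken from 'ABC'
0 0 0
      ⍬ eo 'ABC'      ⍝ overlay: placing nothing on 'ABC' leaves it unchanged
1 1 1
```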

Morten Kromberg speculated that the behaviour stemmed from early flat APL where Match didn't exist. Instead, common practice was to use And-reduction (often written as the inner product ∧.=) over element-wise equality, which ignores type mismatches because the comparison of two empty arrays (a scalar function application) itself is empty, thus making the reduction yield the identity element of And, which is true (1).
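
The difference between the two styles of comparison can be seen directly on empty arrays of unequal type. This is a small illustration, assuming a dialect in which Match distinguishes prototypes, such as Dyalog APL.

```apl
      ⍬≡''       ⍝ Match distinguishes the numeric and character empty vectors
0
      ⍬∧.=''     ⍝ the inner product reduces over an empty comparison
1
      ∧/⍬=''     ⍝ And-reduction of an empty vector yields its identity element
1
```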

In April, Hui wrote that he disagree[d] strongly with the "alternative APL and mental model" which Brudzewsky had devised, because it among other things [meant he] can not give a good accounting of it. He added that all the descriptions (APL or non-APL) of string search/find that [he had] seen do not use that mental model.

Kromberg agreed with Hui that Brudzewsky's model was strained at best and clearly a modern construction based on a more complete understanding of and prototypes, than a possible explanation for what the implementors were thinking when they did this work. He reiterated his theory about And-reductions over equality in a moving window, thus arguing that the current behaviour can be seen as correct.

In May, Brudzewsky found support for Kromberg's theory in a conference proceeding that showed that exact usage,[3] and found that redefining ≡ accordingly as {(⍺≡⍥⍴⍵)∧(∧/⍺≡¨⍥,⍵)} would make Hui's ebar model align with the behaviour of the primitive as implemented. Hui promised to write an appendix to his earlier forum post at an appropriate time, but passed away before being able to do so.
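
The effect of that redefinition can be sketched by giving the function a name. The name looseMatch is hypothetical and chosen only for this illustration, and the Over operator ⍥ assumes Dyalog APL 18.0 or later.

```apl
      looseMatch←{(⍺≡⍥⍴⍵)∧(∧/⍺≡¨⍥,⍵)}   ⍝ hypothetical name: same shape, and all items match
      ⍬ looseMatch ''                    ⍝ empty arrays of unequal type are accepted
1
      ⍬≡''                               ⍝ whereas Match rejects them
0
      'ANA' looseMatch 'ANA'
1
      'ANA' looseMatch 'ANB'
0
```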

References

  1. Hui, Roger. ⍷ follies. Dyalog Forums. 16 Feb 2021.
  2. Dyalog Ltd. Internal emails. more ⍷ follies, 15–19 Feb; ancient bug in ⍷ with empty left argument, 7–8 Apr, and 26 May 2021.
  3. Falkoff, Adin. A note on pattern matching: Where do you find the match to an empty array? APL79. doi:10.1145/800136.804470.