Search speed issue

I ran into this situation. I am searching a layer for an address like this:

    srch = "[NAME] = """ & strt & """"
    srch &= " and ("
    srch &= " ([FROMLEFT]<=" & addrnum & " and [TOLEFT]>=" & addrnum & ")"
    srch &= " or ([FROMRIGHT]<=" & addrnum & " and [TORIGHT]>=" & addrnum & ")"
    srch &= ")"
    srch &= " and ("
    srch &= "[ZIP_LEFT] = """ & zipin & """"
    srch &= " or "
    srch &= "[ZIP_RIGHT] = """ & zipin & """"
    srch &= ")"

    sf.Table.Query(srch, res, errstr)

At the top of the class, I tried two ways to get an instance of the layer:

    'This way, exporting to make a copy results in very fast search - 17 seconds to look up 137 addresses
    Dim sf_temp As MapWinGIS.Shapefile
    sf_temp = Form1.Map1.get_GetObject(Form1.Handle_RoadsLayer) ' this gets layer for searching
    sf_temp.SelectAll()
    sf = sf_temp.ExportSelection

    ' this is the alternative to the above creation of temp layer 
    ' runs much slower: 2 min 13 sec to geocode 137 points
    sf = Form1.Map1.get_GetObject(Form1.Handle_RoadsLayer) ' this gets layer for searching

Searching is dramatically slower the second way. Any thoughts?

This is an interesting and unexpected result.
Can you share your shapefile so we can try to debug it?

Here is the shapefile.

RD12115.zip (3.5 MB)

I made a unit test to quickly test some variations and I also get different timing.
I need to debug this more.

Here’s my unit test:

    var sf = Helper.OpenShapefile("RD12115.shp", true, this);

    const string query = @"[NAME] =""la France Ave"" and (([FROMLEFT]<=4201 and [TOLEFT]>=4499) or ([FROMRIGHT]<=4200 and [TORIGHT]>=4498)) and ([ZIP_LEFT] = ""34286"" or [ZIP_RIGHT] = ""34286"")";
    Debug.WriteLine(query);
    object result = null;
    string errorString = null;

    var stopWatch = new Stopwatch();
    stopWatch.Start();
    Assert.IsTrue(sf.Table.Query(query, ref result, ref errorString), "Table.Query: " + errorString);
    stopWatch.Stop();
    Debug.WriteLine("Time it took to query filebased shapefile: " + stopWatch.Elapsed);

    Assert.IsNotNull(result);
    var indexes = result as int[];
    Assert.IsNotNull(indexes);
    Assert.IsTrue(indexes.Length > 0, "No results found");

    // Make memory copy:
    sf.SelectAll();
    var sfTemp = sf.ExportSelection();
    sf.SelectNone();

    stopWatch.Restart();
    Assert.IsTrue(sfTemp.Table.Query(query, ref result, ref errorString), "Table.Query: " + errorString);
    stopWatch.Stop();
    Debug.WriteLine("Time it took to query memory shapefile: " + stopWatch.Elapsed);
    var fileLocation = Path.Combine(Path.GetTempPath(), "speed.shp");
    Helper.SaveAsShapefile(sfTemp, fileLocation);

    // Again after saving
    stopWatch.Restart();
    Assert.IsTrue(sfTemp.Table.Query(query, ref result, ref errorString), "Table.Query: " + errorString);
    stopWatch.Stop();
    Debug.WriteLine("Time it took to query memory shapefile after saving: " + stopWatch.Elapsed);
    sfTemp.Close();

    // Use QTree
    sf.UseQTree = true;
    stopWatch.Restart();
    Assert.IsTrue(sf.Table.Query(query, ref result, ref errorString), "Table.Query: " + errorString);
    stopWatch.Stop();
    Debug.WriteLine("Time it took to query shapefile with QTree index: " + stopWatch.Elapsed);

    // Use spatial index:
    Assert.IsTrue(sf.CreateSpatialIndex(sf.Filename), "Cannot create spatial index");
    stopWatch.Restart();
    Assert.IsTrue(sf.Table.Query(query, ref result, ref errorString), "Table.Query: " + errorString);
    stopWatch.Stop();
    Debug.WriteLine("Time it took to query shapefile with spatial index: " + stopWatch.Elapsed);

    // Open temp shapefile:
    var sfTemp2 = Helper.OpenShapefile(fileLocation);
    sfTemp2.UseQTree = true;
    stopWatch.Restart();
    Assert.IsTrue(sfTemp2.Table.Query(query, ref result, ref errorString), "Table.Query: " + errorString);
    stopWatch.Stop();
    Debug.WriteLine("Time it took to query temp shapefile with QTree index after loading from disk: " + stopWatch.Elapsed);

The Helper class is on GitHub.

The results of my test:

    Time it took to query filebased shapefile: 00:00:01.2651178
    Time it took to query memory shapefile: 00:00:00.2310401
    Time it took to query memory shapefile after saving: 00:00:00.6871266
    Time it took to query shapefile with QTree index: 00:00:00.7006017
    Time it took to query shapefile with spatial index: 00:00:00.7213261
    Time it took to query temp shapefile with QTree index after loading from disk: 00:00:01.2230481

I updated the test a bit and noticed that the first query against a file-based shapefile is always slower than the second, so some caching must be involved.
The in-memory shapefile stays much faster and shows no difference between the first and the second query, presumably because it is already fully optimized:

    Time it took to query filebased shapefile: 00:00:01.1245868
    Time it took to query filebased shapefile second time: 00:00:00.6824004

    Time it took to ExportSelection: 00:00:01.2607499

    Time it took to query memory shapefile: 00:00:00.2366985
    Time it took to query memory shapefile second time: 00:00:00.2387093

    Time it took to query memory shapefile after saving: 00:00:00.6258544
    Time it took to query memory shapefile after saving second time: 00:00:00.2375523

    Time it took to query shapefile with QTree index: 00:00:00.6741876
    Time it took to query shapefile with QTree index second time: 00:00:00.6730707

    Time it took to query shapefile with spatial index: 00:00:00.6897191
    Time it took to query shapefile with spatial index second time: 00:00:00.6699897

    Time it took to open shapefile: 00:00:00.0165911
    Time it took to query  temp shapefile with QTree index after loading from disk: 00:00:01.1004431
    Time it took to query  temp shapefile with QTree index after loading from disk second time: 00:00:00.6669803
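
For anyone who wants to reproduce the first-run versus second-run comparison without repeating the stopwatch boilerplate, here is a minimal sketch of a timing helper. The class name `QueryTiming` and the method `Measure` are hypothetical and are not part of MapWinGIS or of the Helper class; the sketch only wraps `System.Diagnostics.Stopwatch`.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper (not part of MapWinGIS): times repeated runs of one action
// so the first (cold) run can be compared with later (warm) runs.
public static class QueryTiming
{
    // Runs 'action' the given number of times and returns the elapsed time of each run.
    public static TimeSpan[] Measure(Action action, int runs = 2)
    {
        if (action == null) throw new ArgumentNullException(nameof(action));
        if (runs < 1) throw new ArgumentOutOfRangeException(nameof(runs));

        var timings = new TimeSpan[runs];
        for (var i = 0; i < runs; i++)
        {
            var stopWatch = Stopwatch.StartNew();
            action();
            stopWatch.Stop();
            timings[i] = stopWatch.Elapsed;
        }
        return timings;
    }
}
```

In the unit test it could probably be called as `QueryTiming.Measure(() => sf.Table.Query(query, ref result, ref errorString))`, where index 0 of the returned array is the first query and index 1 the second.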

A file-based spatial index or a QTree index has no effect, which makes sense because both are spatial indexes and do not index attributes.
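
To illustrate what an attribute index would look like, as opposed to a spatial one, here is a rough sketch of an in-memory lookup from street name to row indexes. This is not something MapWinGIS provides as far as this thread shows; `NameIndex` and `Find` are hypothetical names, the caller is assumed to have read the `NAME` column into a list once, and the case-sensitive (ordinal) comparison is an assumption that may not match how `Table.Query` compares strings.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical attribute lookup (not a MapWinGIS feature): maps a street name
// to the row indexes that carry that name, so only those rows need further checks.
public sealed class NameIndex
{
    // Assumption: ordinal, case-sensitive matching.
    private readonly Dictionary<string, List<int>> _byName =
        new Dictionary<string, List<int>>(StringComparer.Ordinal);

    // 'names' holds one value per table row, in row order (read once by the caller).
    public NameIndex(IReadOnlyList<string> names)
    {
        if (names == null) throw new ArgumentNullException(nameof(names));
        for (var i = 0; i < names.Count; i++)
        {
            var key = names[i] ?? string.Empty;
            if (!_byName.TryGetValue(key, out var rows))
            {
                rows = new List<int>();
                _byName[key] = rows;
            }
            rows.Add(i);
        }
    }

    // Returns the row indexes whose name equals 'name', or an empty list if none match.
    public IReadOnlyList<int> Find(string name)
    {
        return _byName.TryGetValue(name ?? string.Empty, out var rows)
            ? (IReadOnlyList<int>)rows
            : Array.Empty<int>();
    }
}
```

The address range and ZIP conditions would still have to be checked on the candidate rows; the sketch only narrows the search by name.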

I tried debugging MapWinGIS to see what differs between the file-based shapefile and the in-memory version, but couldn’t find anything, most likely because I’m not a good C++ debugger :wink:

@jerryfaust When you have some time, could you have a look at this?
I also created https://mapwindow.atlassian.net/browse/MWGIS-166 for this.