Prefetch-Method crashes MS-Access

Hi there,

I discussed this topic with Jerry for a while at the start of this year.
Since you released a new stable version, I took the time to test that behaviour again.

Prefetch_Test_db.zip (308.7 KB)

Attached you can find an MS Access database that contains one form, together with a small shapefile containing tree locations.
When the form opens, it creates a background tile layer from a locally running OSM tile server (Windows Subsystem for Linux is really great; this can easily be changed in Form_Load to a common OSM tile server), reprojects the shapefile, creates a thematic map based on a field in the shapefile and displays the results.

The plan is to give the user the opportunity to store the background tiles locally, so that everything can be copied to a laptop with no internet connection. For this I implemented the Prefetch method to fill a local tile cache with the tiles of all zoom levels.

So open the database, open the form (to see the messages of the callback interface you need to open the integrated VBE and the Immediate window) and then click the prefetch button.

For me, this crashes MS Access reliably.

I would be very happy to get this component running reliably in an MS Access form.
In spring this year, even clicking the identify button on the form and double-clicking one tree reliably crashed MS Access, but this doesn’t seem to happen with this new version, which is good.

Any ideas and help are very welcome.
I have a customer who would even like to pay a developer to get this component fully stable in MS Access; just get in contact with me if you’re interested.

So have a nice weekend
and best regards

Stefan

Hello @jTati

I started to review this issue, and realized that I do not have the local tile directory. Is it very large? Could you zip some of it and upload it?

Thanks,
Jerry.

Good Morning Jerry,

The sample database generates a SQLite database to store the tiles. For this particular sample with its very small shapefile, the database shouldn’t get beyond xx MB in size. In February I sent you the original shapefile, which was much larger, and there this SQLite db grew to 100 - 200 MB in size, depending on when the crash occurred.
I actually cannot send you such a database, since creating it is the process that reliably crashes MS Access; it just vanishes from the screen.

But if there is anything I can provide to help get this fixed, I will.

Thanks for looking into this
and
kind regards

Stefan

Hello Stefan.

Ok, well I’ve had some success.

So it looked to me (and I think you said this also) that this newest database (the Prefetch Test) relies on a local disk-based tile service, similar to what is created using your older database (MapWinGIS TilesTest). So I revived the earlier one (which fetches from OpenStreetMap, levels 1 through 19), and I couldn’t get past around level 12 before it crashed. The prefetch does background processing (that I still need to investigate) that the Access environment can’t keep up with (and I don’t yet know why).

First I changed the OCX to only allow one thread at a time to do the read from OSM. Even with this, it couldn’t keep up. I then noticed that if I stepped through the Access code, and with a breakpoint at each iteration of the loop, I would stop and wait, to let the background processing catch-up, so-to-speak. With this method, I was able to process up to about level 17 before it crashed (trying to process thousands of tiles in the background). And when it crashes, it is usually in an MS Office dll (Mso10Win32Client.dll).

Although levels 18 and 19 were incomplete, I switched to the newer database (Prefetch) that reads from the disk tiles to load into sqlite. One thing I changed was to use a new feature of the OCX that allows local disk-based tiles as files (not a hosted service).

tlsTemp.Providers.Add 1030, "localWMST", "file:///C:\Temp\TilesTest\tiles\{zoom}\{x}\{y}.png", tkTileProjection.SphericalMercator, 1, 19

I set up to pre-fetch levels 16 down to 12. Even so, it was still crashing, but I was able to catch one of the errors, and it was in the StopFunction (called on every tile) during the bulk tile loading. So I (at least for now) set the StopFunction to Nothing so that it won’t get called. I still got an error, so I went in again and put a breakpoint at every iteration of the prefetch loop in Access, and in this case I was able to get through every level. Here’s the result in your app.

I’m generally convinced that there is an issue with background threading or events in the MS Access world. But the full verdict is not yet in. At some point, I may be able to catch the root culprit. Or it may be that there is a need for a custom OCX to work more reliably with Access. As it is, I already maintain 2 different custom OCXs for a couple of customers with various proprietary reasons (one is for the Progress environment, which is an ‘interpretive’ environment similar to Access).

That’s all for now. We can talk more about where to go from here.

Regards,
Jerry.

Hello Jerry,

thanks for investing your time!

I also spent yesterday afternoon playing around with the debugger and the current source code, and I had the same experience: crashes occurred in different MS Access modules, be it vbe7.dll or msaccess.exe. On the source-code side it always had to do with modules around thread handling while loading tiles and the different events (callback / stopIF / TilesLoaded). The most common error was: “Invalid tile reference number”.
To my very basic understanding, this looks as if something gets mixed up with the expected tile references in the asynchronous background loading. Your answer points in the same direction.
I also looked into the source code for tile handling and found nearly no comments or documentation in the code, so there is no chance for me as a C++ newbie to understand anything at all.

By the way:
The tile source I use is a locally running tile server, so there is no file-based access; it’s technically the same as connecting to an OSM tile server. The only difference is that it might serve the tiles much faster. That might be the reason why choosing the built-in OSM tile server as the tile source makes it run more stably: the tiles come in much more slowly.

I don’t know if you see it that way too, but if this turns out to be a general problem with thread handling in the OCX, then a separate branch of this project might not be the right approach. But I’m open to everything.

Have a nice day and best regards
Stefan

Hello Stefan.

Just to clarify, I don’t believe there are threading issues with the OCX. In fact, the change I made (to force single threaded loads of OSM tiles) was dumbing down the OCX, and reverting back to an earlier state that was considered a bug (see MWGIS-207).

I agree with your opening paragraphs, but believe it specifically has something to do with how MS Access affects the asynchronous background loading. We don’t see these things when running within compiled applications. So we may need to enforce more single-threaded, synchronous behavior within the OCX. Right now (by default) it is compiled with the so-called “free threading” model, as opposed to the STA (single-threaded apartment) model. That will be one thing I can easily change to check out the behavior. But as we dig down, there may be other things as well.

It is unfortunate that there are many sections of the code with very few comments. As I go in to make fixes/changes/etc, I try to add comments to clarify intent.

I still want to track down that “Invalid tile reference number”, and that may give us more clues.

Regards,
Jerry.

Hello again.

I haven’t forgotten about this. I got back into debugging a couple nights ago. It looks like the “Invalid tile reference number” is a bug in the OCX (releasing a reference for which it had not called AddRef). I’ve got more still to do, and I’ll let you know if I discover anything.

Regards,
Jerry.

Hi Jerry,

thanks, and if there’s anything I can do, let me know.

best regards
Stefan

Hello Stefan.

Working with the original MapWinGis_TilesTest, I think it’s now mostly working. You can try things on your end.

First, you want to keep the TilesThreadPoolSize = 1. After other successes, I tried putting this back up, and it crashed again.

Second, the problem with the StopFunction callback is that it was Private rather than Public.

Public Function IStopExecution_StopFunction() As Boolean
    IStopExecution_StopFunction = m_bolStop
End Function

Third, there are thousands of images being fetched, and it takes hours. The loop below executes rather quickly, but the background work continues on. Notice that after the loop, I added a GoTo Function_Exit. All of the actions you have after the loop should not yet take place, because the operation isn’t finished. You don’t yet want to clear the caches, and you don’t want to call Stop on the Stop function, because that actually stops the background processing that is taking place.

There’s a note on the Prefetch function in the documentation that indicates when the process is complete, sending a -1 to the callback. Presumably, you could watch for that event, and then do the clean up and close. In my case, I just let it run and went to bed, although it still ran out of memory before it was complete. Even so, I got a lot of tiles out in the 18 and 19 zoom range. You may need to try breaking it down, and at those higher levels, where there are thousands of images involved, just do one zoom level at a time; let it finish, then move on to the next.
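
The "one zoom level at a time" idea can be sketched independently of the OCX. The following C++ is only a minimal illustration, not MapWinGIS code: `SequentialPrefetcher`, `StartFn` and `OnProgress` are hypothetical names, and it assumes the loader reports completion by passing -1 to a progress callback, as the documentation note describes.

```cpp
#include <functional>
#include <utility>

// Hypothetical helper (not part of MapWinGIS): starts the prefetch for one
// zoom level and only moves on once the previous level has reported -1.
class SequentialPrefetcher {
public:
    using StartFn = std::function<void(int zoom)>;

    SequentialPrefetcher(int firstZoom, int lastZoom, StartFn start)
        : _next(firstZoom), _last(lastZoom), _start(std::move(start)) {}

    // Kick off the first zoom level.
    void Begin() { StartNext(); }

    // Assumed to be wired to the loader's progress callback.
    // Ordinary percentages are ignored; -1 is treated as "level complete".
    void OnProgress(int percent) {
        if (percent == -1) StartNext();
    }

    // True once the last level has reported completion.
    bool Finished() const { return _finished; }

private:
    void StartNext() {
        if (_next > _last) {
            _finished = true;   // safe point for cleanup (caches, stop flag)
            return;
        }
        const int zoom = _next++;
        _start(zoom);           // e.g. a wrapper around the Prefetch call
    }

    int _next;
    int _last;
    StartFn _start;
    bool _finished = false;
};
```

With this shape, the cleanup that currently sits after the loop would run only when `Finished()` becomes true, rather than immediately after the requests have been queued.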

I then went back to the newer one, the Prefetch_Test, and with similar fixes (i.e. to the StopFunction), it ran through without error.

    For intZoomRunner = 1 To 19

        Debug.Print "Lade Zoom-Level: " + CStr(intZoomRunner)

        bolTilesLoaded = False
    
        lngResult = Me.MapMain.Tiles.PrefetchToFolder(extProvider, intZoomRunner, Me.MapMain.Tiles.ProviderId, strTemp, ".png", clsStopFunction)
     
        DoEvents
        WaitASecond 1
        DoEvents
        WaitASecond 1
        DoEvents
        WaitASecond 1
        DoEvents
    
        Debug.Print "-------------------------------------"
        Debug.Print "Zoom: " + CStr(intZoomRunner) + ", Prefetch-Result: " + CStr(lngResult)
        Debug.Print "-------------------------------------"
        DoEvents: DoEvents: DoEvents
    
    Next

    Call Me.MapMain.LockWindow(tkLockMode.lmUnlock)
    GoTo Function_Exit

    Me.MapMain.Tiles.ClearCache tkCacheType.RAM
    Me.MapMain.Tiles.DoCaching(tkCacheType.RAM) = True
    Me.MapMain.Tiles.UseCache(tkCacheType.Disk) = True

Function_End:
    On Error Resume Next

    Call Me.MapMain.LockWindow(tkLockMode.lmUnlock)

    If (Not (clsStopFunction Is Nothing)) Then clsStopFunction.Stoppen
    DoEvents: DoEvents: DoEvents

    If (Not (myGlobalSettings Is Nothing)) Then _
        Call myGlobalSettings.StopLogTileRequests

    If Len(strErrorMessage) > 0 Then _
        Debug.Print strErrorMessage

Function_Exit:
Exit Sub

Let me know what your experience is.

Regards,
Jerry.

P.S. I just remembered that I’ve slightly modified the OCX on my end. I’ll have to figure out how much those changes contributed to the current solution, or whether they can be undone. I guess you’ll know more when you try it on your end…

Hi Jerry,

and thanks for your effort.

I took the code, changed TilesThreadPoolSize to 1, and changed the StopInterface functions to Public, but that didn’t change a thing. In fact, Access crashed even faster.
The full error message is:

Assertion failed: Invalid reference count for a tile

What I’ve been fiddling with for a while is this TilesProvider.GlobalCallback that you mentioned, which is supposed to send a -1 when the asynchronous tile load finishes.
I cannot set the GlobalCallback on the tile provider, and I cannot read its properties if it’s not set. The application callback is set, but even with this setting the properties of TileProvider.GlobalCallback are still unavailable.
I think I need some enlightenment on how to use this GlobalCallback property.

Storing the tiles as a bunch of files in the filesystem is not useful for my application:
The database and the background tiles should be copied to a mobile device and, as you know, copying thousands of small files to an SSD or USB device takes forever, whereas copying one huge file, as would be the case when the tiles are stored in one SQLite db, is as fast as the interface allows.

If you’ve changed the MapWinGIS code, maybe you can find a way to send it to me. I’m able to compile the current code and run a debugger, as you explained to me at the start of the year.

Thanks and have a nice day
Stefan

Hi Jerry,

Reading through the documentation you mentioned: TilesProvider.GlobalCallback is deprecated and replaced by GlobalSettings.ApplicationCallback, a property my sample database already uses.
The next thing is that this callback object has no method or property to get a return value like -1.
When I add a public property to the ApplicationCallback object reflecting the percentage that is internally available, it stays at 0 the whole time, never giving back -1, even when the prefetch process for, let’s say, one tile is finished.

By the way:
MS Access also crashes when you just pan around in the map long enough; the prefetch method just makes it happen a lot faster.

cheers
Stefan

Just found this:

When debugging MapWinGIS with a breakpoint in TileBulkLoader.cpp in “void TileBulkLoader”, on the line where it checks whether the _callback object is set, it tells me when running a prefetch that “_callback == NULL”, so the return value of -1 can never be returned, even though I set a callback object.
…but as I said, I have absolutely no idea about C++

Hello.

You’ve got a few questions here. I’ll start with the last.

Looking through the source code, it looks like if you set Tiles.Callback, that callback pointer is handed into the various Prefetch methods. Prefetch hands it into the TileBulkLoader (line 133 of PrefetchManager).

What might be the case is that this must be specifically set into the Tiles class, as opposed to the GlobalSettings Callback. Historically, the OCX had individual callbacks for each class, and then moved towards a Global callback. (One callback to rule them all). But from what I see at the moment, this requires the Tile callback (axMap1.Tiles.GlobalCallback).

Then, in the TileLoaded method of TileBulkLoader (which is likely where you were looking), the _callback variable should not be null, and when it exceeds the totalCount, it will return a -1 in the Progress callback.

Ok, I just tested it, and indeed you’ll get the callback with the following:

If (myCallBack Is Nothing) Then
    Set myCallBack = New clsCallback
    myGlobalSettings.ApplicationCallback = myCallBack
    myGlobalSettings.CallbackVerbosity = MapWinGIS.tkCallbackVerbosity.cvAll
    Dim t As Tiles
    ' even though Tiles doesn't show up in Intellisense,
    ' it's there, and will be found late-bound
    Set t = Me.Map0.Tiles
    t.GlobalCallback = myCallBack
End If

More to come.

Jerry.

Hello again.

Here are some of the changes I made to the OCX. You can mix and match and see which provides for more predictable behavior.

At the top of Stdafx.h

#define _ATL_SINGLE_THREADED // _ATL_APARTMENT_THREADED // _ATL_FREE_THREADED	// 

The standard release is Free Threaded, I have been running Single threaded for my testing.

At the top of BaseProvider.cpp, GetTileHttpData, roughly line 63, uncomment the CSingleLock call, which will force single-threaded access to the fetch routine.

CMemoryBitmap* BaseProvider::GetTileHttpData(CString url, CString shortUrl, bool recursive)
{
    // single-file access to the tile load
    // MWGIS-207; allow multiple thread access
    CSingleLock lock(&_clientLock, TRUE);

In SQLiteCache.cpp, roughly line 259, comment out the Release function, which is causing the Tile Reference count assertion:

//tile->Release(); // no AddRef was ever called

and similarly, at the end of DiskCache.cpp, AddTile, roughly line 122

//tile->Release(); // no AddRef was ever called

And I think you’re fine going back to a local http server rather than the file-based mechanism.

I think that covers all of the changes.

Regards,
Jerry.

One more thing to try. Don’t set a callback or a stop function and see if that contributes to stability; it seems to on my end. (You can pass Nothing in for the StopInterface).

Curious how this affects things on your end.

Jerry.

I implemented the changes you suggested, but first things first:
To fake a kind of callback, I changed the prefetch loop to this:

###########################################
For intZoomLevel = intMAX_ZOOM_LEVEL To 12 Step -1

DoEvents
lngResult = tlsTemp.Prefetch(dblLatMin, dblLatMax, dblLongMin, dblLongMax, intZoomLevel, tlsTemp.ProviderId, Nothing)
DoEvents
Debug.Print "Zoom-Level: " + CStr(intZoomLevel) + " / Tiles to Fetch: " + CStr(lngResult)
DoEvents

Do
   DoEvents
   lngTileFileSize = FileLen(strTileFile)
   DoEvents
   Call WaitASecond(1)
   DoEvents
   Call WaitASecond(1)
   DoEvents
   Call WaitASecond(1)
   DoEvents
   Call WaitASecond(1)
   DoEvents
   Call WaitASecond(1)
   DoEvents
   Call WaitASecond(1)
   DoEvents
   
   lngResult = FileLen(strTileFile)
          
Loop While lngResult > lngTileFileSize

   DoEvents
   Call WaitASecond(1)
   DoEvents
   Call WaitASecond(1)
   DoEvents
   Call WaitASecond(1)
   DoEvents
   Call WaitASecond(1)
   DoEvents
   Call WaitASecond(1)
   DoEvents

Debug.Print "Finished Zoom-Level: " + CStr(intZoomLevel)
DoEvents

Next
###########################################

so the program doesn’t start fetching the next zoom level before the previous call has finished.
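
The same "wait until the cache file stops growing" heuristic can be written more compactly. The C++ below is a minimal sketch under stated assumptions, not code from the sample database or the OCX: `WaitUntilStable` and its parameters are hypothetical, and the size reader is passed in so that any source (for example a lambda wrapping `std::filesystem::file_size`) can be used.

```cpp
#include <chrono>
#include <cstdint>
#include <functional>
#include <thread>

// Hypothetical helper: polls readSize() until the reported size has not grown
// for `stableSamplesRequired` consecutive samples, then returns the last size.
inline std::uintmax_t WaitUntilStable(
    const std::function<std::uintmax_t()>& readSize,
    std::chrono::milliseconds interval,
    int stableSamplesRequired = 2)
{
    std::uintmax_t last = readSize();
    int stable = 0;
    while (stable < stableSamplesRequired) {
        std::this_thread::sleep_for(interval);
        const std::uintmax_t now = readSize();
        if (now > last) {
            last = now;     // still growing: start counting again
            stable = 0;
        } else {
            ++stable;       // no growth in this interval
        }
    }
    return last;
}
```

Requiring more than one quiet interval makes the check slightly less sensitive to a short pause, but a size plateau is only a proxy for completion: a stalled download or a cap on the cache size would look exactly the same.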

With your changes and this loop, the prefetch ran until the SQLite db reached a size of 95,083,520 bytes (reproducible); then it didn’t grow any further, and so the loop moved on.

Just out of curiosity I tested the same with the released OCX (with all stop interfaces and callbacks switched off) and it behaved exactly the same, even with 16 threads in one TileThreadPool.
(regsvr32.exe /u / regsvr32.exe <released.ocx>)

So from here it looks as if switching off the stop interface and callbacks is the reason for this improved stability.
But after the SQLite tile cache reached the size mentioned above, the map was no longer usable: extremely slow, and sometimes Access just vanished from the screen when panning around.
At that time Access used somewhere around 600 MB of RAM, a lot more than after opening the form, but from my point of view not critically high.

In two tests, a freshly started Access db was not able to use the tilecache.db created with the process above: the map was unbearably slow and many tiles didn’t get painted.
I used two different machines for testing, and they behaved the same.

no clues, - regards
Stefan

Hi Jerry,

got a good bit further:

First: the barrier of ~100 MB seems to be a limit set in SQLiteCache.h, where _maxSize is set to 100. When this size is reached, a cleanup is run on the disk cache file; after that the file size doesn’t grow and my routine moves on.
Second: the tile log file suggests that there was still some mix-up of the different zoom levels going on when tiles get requested and stored, so I looked into the code:

You suggested removing all these tile->Release calls. Looking into the Release function in TileCore.cpp, I changed it to:

################
long TileCore::Release()
{
InterlockedDecrement(&_refCount);

//if (_refCount < 0)
//    CallbackHelper::AssertionFailed("Invalid reference count for a tile.");

if (this->_refCount <= 0)
{
    delete this;
    return 0;
}
return _refCount;

}

################

to make it a bit more tolerant: the tile objects are still deleted, even when an obscure refCount occurs.
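
One detail of the tolerant Release quoted above is that it decrements atomically but then reads `_refCount` again in separate steps, so another thread could change the value in between. A minimal sketch of the same tolerant behaviour that decides on the value returned by the decrement itself is shown here. `RefCountedTile` is a hypothetical stand-in, not the MapWinGIS `TileCore` class, and it uses `std::atomic` instead of `InterlockedDecrement` so that it is portable.

```cpp
#include <atomic>

// Hypothetical stand-in for a reference-counted tile (not TileCore).
class RefCountedTile {
public:
    // `destroyed` is an optional counter, used here only to observe deletion.
    explicit RefCountedTile(std::atomic<int>* destroyed = nullptr)
        : _refCount(0), _destroyed(destroyed) {}

    long AddRef() { return ++_refCount; }

    // Tolerant release: deletes the object when the count reaches zero
    // or below, and never touches members after the delete.
    long Release() {
        const long remaining = --_refCount;  // atomic decrement, result captured once
        if (remaining <= 0) {
            delete this;
            return 0;
        }
        return remaining;
    }

private:
    // Private destructor: instances must live on the heap and die via Release().
    ~RefCountedTile() {
        if (_destroyed) ++(*_destroyed);
    }

    std::atomic<long> _refCount;
    std::atomic<int>* _destroyed;
};
```

This only tidies up the decrement. It does not cure an unbalanced call site: if two places both call Release on a tile whose count was never raised, the second call still operates on a deleted object.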

I also commented out one line in ITileLoader.cpp:
#########
// *******************************************************
// PreparePool()
// *******************************************************
CThreadPool<ThreadWorker>* ITileLoader::PreparePool()
{
    if (!InitPools())
        return nullptr;

    CThreadPool<ThreadWorker>* pool = _lastGeneration % 2 == 0 ? _pool : _pool2;

    // pool->SetSize(m_globalSettings.GetTilesThreadPoolSize());

    pool->SetTimeout(100000); // 100 seconds (lower rate limit may be set)

    tilesLogger.WriteLine("Tiles requested; generation = %d", _lastGeneration);

    CleanTasks();

    CleanRequests();

    return pool;
}
#########

Because the pool size is already set in InitPools, setting it again just one step later doesn’t make sense to me and can only cause problems.


I compiled the OCX with these changes and ran the prefetch again, and hey, it worked!
-> The resulting SQLite db is now smaller than before, but analysis with an external db tool revealed that all tiles for the zoom levels really are present. To me this looks as if the unreleased tiles might have been stored more than once.
-> After running the prefetch, the form still works and accesses the SQLite cache; switching off the tile server doesn’t make the background disappear.

I have tested this just once and on one machine, so I need to test it a bit more.
But since I tested it yesterday without these changes at least 4 times on 2 different machines and got the same problems every time, this looks promising.

Best regards
Stefan

Just as a detail on this successful run:
– your suggested changes to single-threading were not applied
– the OCX was running with a TilesThreadPoolSize of 16
– I increased _maxSize to 200

It looks as if it’s not the tile loader itself that causes problems, but callbacks and events in general. One cause of the crashes when prefetching is most probably gone, because the “Assertion failed” callback is no longer raised.
When I add a callback object, even with my modified OCX, it starts crashing again while panning around massively in the map, and I even get crashes while debugging, when VS points me to the TilesLoaded event, which I do not even handle in my current db.
When I stop doing a “Debug.Print” in the callback class definition and instead just copy the values to public variables and read those, it also runs more stably.