Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed Zimit archives are not displaying correctly. #3580

Merged
merged 1 commit into from
Dec 14, 2023
Merged

Conversation

MohitMaliFtechiz
Copy link
Collaborator

Fixes #3578

  • The getEntryByPath method was being called twice when retrieving content. Consequently, the initial call returned the content entry URL with additional parameters. Subsequently, when making the second call with the provided URL, it resulted in an "entry not found" exception. This issue prevented the loading of the CSS and content of the Zim file.
  • Additionally, the getActualUrl method was primarily implemented to retrieve the redirect entry of the URL provided by the webView. It is unnecessary to invoke this method when obtaining content.

Before fix

withdualmethodcall.mp4

After fixing

Afterfix.mp4

@Jaifroid
Copy link
Member

Jaifroid commented Dec 6, 2023

@MohitMaliFtechiz From your video it looks like this fixes it! I'm afraid I don't have the setup to build the Android app, so I can only test an APK, if that would be useful. Otherwise, since the before was the same as my before, and the after looks most definitely fixed, I'd say this fixes it.

@MohitMaliFtechiz
Copy link
Collaborator Author

@Jaifroid You can download the apk from here https://drive.google.com/file/d/1-aAAdSxrwITj9tfTU-eFJ17faY0tyaOK/view?usp=drivesdk and test the other zimit file also with those zim files that are reported.

@Jaifroid
Copy link
Member

Jaifroid commented Dec 7, 2023

@MohitMaliFtechiz I've tested that APK on my midrange Android, and as far as I can tell it's all working fine. I tested four or five different Zimit-based ZIMs, and they're all displaying as expected.

Unrelated to the issue at hand, I got notified about some leak traces. Do you want more info, or are you already aware of these?

image

@MohitMaliFtechiz
Copy link
Collaborator Author

MohitMaliFtechiz commented Dec 7, 2023

@kelson42 @mgautierfr, While testing this fix with other zim files I have a bug with libkiwix. Sometimes it shows infinitely loading and sometimes it shows Sorry this URL was not found in the archive but if I am again loading the same zim file or page it is working.

Entrynotshowing.mp4
Addingundefinedintheurl.mp4

Whenever this bug occurs the logs show a undefined inside the URL that is returned by the libkiwix.

 Could not get Item for url = https://kiwix.app/A/undefinedH/journals.openedition.org%2Fbibnum%2F 
                                                                                                     original exception = java.lang.Exception: Cannot find entry
2023-12-07 11:54:20.553  9226-9279  ZimFileReader           org.kiwix.kiwixmobile                D  getting mimetype for https://kiwix.app/A/undefinedH/journals.openedition.org%2Fbibnum%2F = null
2023-12-07 11:54:20.554  9226-9724  ZimFileReader           org.kiwix.kiwixmobile                E  Could not get Item for url = https://kiwix.app/A/undefinedH/journals.openedition.org%2Fbibnum%2F 
                                                                                                     original exception = java.lang.Exception: Cannot find entry

@mgautierfr is there any condition in the libkiwix code where it adds the undefined in the URL?

The bug exists with this PR code and without this PR code.

@kelson42 Apart from this issue, i have found all the zim files working correctly with this PR change.

@MohitMaliFtechiz
Copy link
Collaborator Author

@MohitMaliFtechiz I've tested that APK on my midrange Android, and as far as I can tell it's all working fine. I tested four or five different Zimit-based ZIMs, and they're all displaying as expected.

@Jaifroid Thanks for testing and your confirmation.

Unrelated to the issue at hand, I got notified about some leak traces. Do you want more info, or are you already aware of these?

@Jaifroid We had implemented the detection of memory leak in our CI, if any memory leak is detected then the CI will fail. These memory leaks you are facing are device-specific as I also get these logs in Samsung device(Android 12) but there is no memory leak reported when I run the application on the Pixel 7a(Android 13) and on the Redmi Note 9(Android 12).

@Jaifroid
Copy link
Member

Jaifroid commented Dec 7, 2023

@MohitMaliFtechiz The infinitely loading bug (the circle at the top of the screen just going round and round) is a sign that the Replay system hasn't loaded correctly, and we're just displaying C/A/index.html. I think there's a race condition because it sometimes works and sometimes doesn't. The fact that it works second time is probably due to cached assets eliminating the race condition.

This issue also occurs with pages served via Kiwix Serve, so I'm not sure it's specific to Android. What should happen is that index.html has a script attached called load.js (available at C/A/load.js in all WARC-based ZIMs). Load.js reads the window.mainUrl, which is the landing page of the Web Archive (i.e. the page we want to get to ultimately), and also initiates the Replay system with a postMessage to the Replay Service Worker giving it the information it needs to load the Repaly iframe. Contents of load.js are below for info. If the Replay Service Worker is not ready at that point, then the load will fail as the postMessage isn't received and dealt with.

The spinner you see is just a gif or something similar. It's not an indicator of any activity.

async function main() {
  const sw = navigator.serviceWorker;

  if (!sw) {

    let msg;
    // check if service worker doesn't work due to http loading
    if (window.location.protocol === "http:" && window.location.hostname !== "localhost") {
      const httpsUrl = window.location.href.replace("http:", "https:");
      document.querySelector("#error").innerHTML = "<p>This page must be loaded via an HTTPS URL to support service workers.</p>" +
          `<a href="https://201708014.azurewebsites.net/index.php?q=oKipp7eAc2SYqrfXwMue06bScM6bytjte9vP5bXHmdKlp8rJn82a37yzspxarZ3kv9XWuNPV7g">Try Loading HTTPS URL?</a>`;
    // otherwise, assume service worker not available at all
    } else {
      document.querySelector("#error").innerHTML =  `<h2>Error</h2>\n
      <p>The requested URL can not be loaded because service workers are not supported here.</p>
      <p>If you use Firefox in Private Mode, try regular mode instead.</p>
      <p>If you use Kiwix-Serve locally, replace the IP in your browser address bar with <code>localhost</code>.</p>`;
    }

    document.querySelector("#loading").style.display = "none";
    return;
  }

  // finds  '/A/' followed by a domain name with a .
  var prefix = window.location.href.slice(0, window.location.href.search(/[/]A[/][^/]+[.]/));

  const name = prefix.slice(prefix.lastIndexOf("/") + 1).replace(/[\W]+/, "");

  prefix += "/A/";

  await sw.register("./sw.js?replayPrefix=&root=" + name, {scope: prefix});

  sw.addEventListener("message", (event) => {
    if (event.data.msg_type === "collAdded" && event.data.name === name) {
      if (window.location.hash && window.location.hash.startsWith("#redirect=")) {
        prefix += decodeURIComponent(window.location.hash.slice("#redirect=".length));
      } else {
        const inx = window.mainUrl.indexOf("//");
        prefix += inx >= 0 ? window.mainUrl.slice(inx + 2) : window.mainUrl;
      }

      console.log("final: " + prefix);
      window.location.href = prefix;
    }
  });

  await new Promise((resolve) => {
    if (!sw.controller) {
      sw.addEventListener("controllerchange", () => {
        resolve();
      });
    } else {
      resolve();
    }
  });

  sw.controller.postMessage({
    msg_type: "addColl",
    name: name,
    file: {"sourceUrl": "proxy:../"},
    root: true,
    skipExisting: false,
    extraConfig: {"sourceType": "kiwix", notFoundPageUrl: "./404.html"},
    topTemplateUrl: "./topFrame.html"
  });
}

window.addEventListener("load", main);

@MohitMaliFtechiz
Copy link
Collaborator Author

MohitMaliFtechiz commented Dec 7, 2023

@Jaifroid Thanks for details.

I think there's a race condition because it sometimes works and sometimes doesn't. The fact that it works second time is probably due to cached assets eliminating the race condition.

@Jaifroid Yes, you are right, it is 60-40 I have tried 10 times, and 6 times it loads the page, and 4 times it fails to load the page.

This issue also occurs with pages served via Kiwix Serve, so I'm not sure it's specific to Android.

Ohh so it means this issue is not Android specific. How are we handling this situation on the kiwix-server?

@Jaifroid
Copy link
Member

Jaifroid commented Dec 7, 2023

Oh so it means this issue is not Android specific. How are we handling this situation on the kiwix-server?

It's not currently handled in Kiwix Serve, and I don't think it can be, as it's just a server that know how to read ZIM URLs. It may be an issue with the Replay backend. @ikreymer already squashed one such bug here: openzim/zimit#154, but it seems like this might be another.

I mean, the Android player probably could handle this with some checks, just as I've had to handle it in the backend of Kiwix JS browser extension now that it has full Replay support. But in Kiwix JS the loading sequence is different, because we can't register the Replay Service worker for each ZIM, instead, we register it as a Worker that communicates with our overall Service Worker, and just change the Collection when a different ZIM request is made (this way I support multiple ZIMs being open at once, with only one Service Worker). In any case, it means that I can wait for the Replay Worker to be ready before looking for W/mainPage (which starts off the whole chain).

That architecture is different from the Kiwix Serve and Android implementations, which rely on registering the Service Worker provided in the ZIM each time a new ZIM is loaded. The async functions in load.js are supposed to wait (it awaits sw registration and only then sends a message, and only when the response is received does it load the URL indicated by window.mainUrl). But clearly something isn't lining up correctly, and I don't know if it's in the Android backend or Replay.

EDIT: Thinking about it, the most likely failure point is that sw.js has not been extracted from the ZIM by the time load.js tries to register it. Is that a problem with the Android backend?

@MohitMaliFtechiz
Copy link
Collaborator Author

MohitMaliFtechiz commented Dec 7, 2023

@Jaifroid In Android we are not handling the extraction or registration ourselves, we have the ServiceWorker provided by the Android webkit that is handling the initialization of Serviceworker, but the loading of the article/entry is handled by the libkiwix itself, in android we are rendering the data that is provided by the libkiwix. So most probably it might be an issue with libkiwix @mgautierfr can you confirm?

EDIT: Thinking about it, the most likely failure point is that sw.js has not been extracted from the ZIM by the time load.js tries to register it. Is that a problem with the Android backend?

High probability of this because sometimes it works and sometimes not.

Copy link

codecov bot commented Dec 7, 2023

Codecov Report

Attention: 1 lines in your changes are missing coverage. Please review.

Comparison is base (20750fe) 48.87% compared to head (bdd41bf) 48.97%.
Report is 16 commits behind head on develop.

❗ Current head bdd41bf differs from pull request most recent head ced2182. Consider uploading reports for the commit ced2182 to get more accurate results

Files Patch % Lines
...org/kiwix/kiwixmobile/core/reader/ZimFileReader.kt 80.00% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@              Coverage Diff              @@
##             develop    #3580      +/-   ##
=============================================
+ Coverage      48.87%   48.97%   +0.09%     
+ Complexity      1094     1086       -8     
=============================================
  Files            285      285              
  Lines          10535    10496      -39     
  Branches        1411     1404       -7     
=============================================
- Hits            5149     5140       -9     
+ Misses          4549     4525      -24     
+ Partials         837      831       -6     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@mgautierfr
Copy link
Member

@mgautierfr is there any condition in the libkiwix code where it adds the undefined in the URL?

In the libkiwix/libzim not. But in some js code which have a undefined value which is added to the url when composing the path ? yes.

In Android we are not handling the extraction or registration ourselves, we have the ServiceWorker provided by the Android webkit that is handling the initialization of Serviceworker, but the loading of the article/entry is handled by the libkiwix itself, in android we are rendering the data that is provided by the libkiwix. So most probably it might be an issue with libkiwix @mgautierfr can you confirm?

What issue exactly you want me to confirm ?

I think there's a race condition because it sometimes works and sometimes doesn't. The fact that it works second time is probably due to cached assets eliminating the race condition.

It make me think about openzim/warc2zim#109

@MohitMaliFtechiz
Copy link
Collaborator Author

MohitMaliFtechiz commented Dec 8, 2023

In the libkiwix/libzim not. But in some js code which have a undefined value which is added to the url when composing the path ? yes.
What issue exactly you want me to confirm ?

@mgautierfr When this undefined is added to the URL? because when it is added it fails to load the article/entry. So probably, as @Jaifroid mentioned it is the issue with load.js?

@mgautierfr
Copy link
Member

mgautierfr commented Dec 8, 2023

@mgautierfr When this undefined is added to the URL? because when it is added it fails to load the article/entry. So probably, as @Jaifroid mentioned it is the issue with load.js?

Is suspect it is added by the service worker code (in the zim file itself) or something like that. It may be because the service worker in wrongly "initialized". I don't know the cause but it may be related to the issue with load.js yes.

@MohitMaliFtechiz
Copy link
Collaborator Author

@mgautierfr Thanks, it seems the load.js issue since Android's service worker initialized correctly. In which repo should I open ticket for this issue? or there is already any ticket for this issue?

Since this PR purpose is to correctly display Zimit archives that have been fixed.

@mgautierfr
Copy link
Member

https://github.com/openzim/warc2zim is probably the right repository.
(Or simply kiwix-android, as I suspect a "interaction" problem between "load.js" created in warc2zim and kiwix-android "web view")

@MohitMaliFtechiz
Copy link
Collaborator Author

@mgautierfr I have open a ticket for this in the warc2zim repo openzim/warc2zim#134

@kelson42
Copy link
Collaborator

@MohitMaliFtechiz What is the status here? We have a bug in both kiwix-android and warc2zim?

@MohitMaliFtechiz
Copy link
Collaborator Author

MohitMaliFtechiz commented Dec 14, 2023

@MohitMaliFtechiz What is the status here? We have a bug in both kiwix-android and warc2zim?

@kelson42 To be honest, I am not very sure here but I have analyzed the behavior this error comes when "ww init" calls twice in sw.js. It is all done inside the ZIM file so probably the bug might be inside the sw.js.

[INFO:CONSOLE(17)] "ww init", source: https://kiwix.app/A/sw.js (17)
2023-12-14 14:30:31.531 18592-18592 chromium                org.kiwix.kiwixmobile                I  [INFO:CONSOLE(9)] "Uncaught TypeError: t.waitUntil is not a function", source: https://kiwix.app/A/sw.js (9)
2023-12-14 14:30:31.532 18592-18592 chromium                org.kiwix.kiwixmobile                I  [INFO:CONSOLE(9)] "Uncaught TypeError: t.waitUntil is not a function", source: https://kiwix.app/A/sw.js (9)
2023-12-14 14:30:31.543 18592-18592 chromium                org.kiwix.kiwixmobile                I  [INFO:CONSOLE(17)] "ww init", source: https://kiwix.app/A/sw.js (17)
2023-12-14 14:30:31.545 18592-18592 chromium                org.kiwix.kiwixmobile                I  [INFO:CONSOLE(9)] "Uncaught TypeError: t.waitUntil is not a function", source: https://kiwix.app/A/sw.js (9)
2023-12-14 14:30:31.547 18592-18592 chromium                org.kiwix.kiwixmobile                I  [INFO:CONSOLE(9)] "Uncaught TypeError: t.waitUntil is not a function", source: https://kiwix.app/A/sw.js (9)
2023-12-14 14:30:31.566 18592-18592 chromium                org.kiwix.kiwixmobile                I  [INFO:CONSOLE(42)] "final: https://kiwix.app/A/journals.openedition.org/bibnum/", source: https://kiwix.app/A/load.js (42)
2023-12-14 14:30:31.574 18592-18592 chromium                org.kiwix.kiwixmobile                I  [INFO:CONSOLE(9)] "[object DOMException]", source: https://kiwix.app/A/sw.js (9)

* The getEntryByPath method was being called twice when retrieving content. Consequently, the initial call returned the content entry URL with additional parameters. Subsequently, when making the second call with the provided URL, it resulted in an "entry not found" exception. This issue prevented the loading of the CSS and content of the Zim file.
* Additionally, the getActualUrl method was primarily implemented to retrieve the redirect entry of the URL provided by the webView. It is unnecessary to invoke this method when obtaining content.
@kelson42 kelson42 merged commit 226213c into develop Dec 14, 2023
8 checks passed
@kelson42 kelson42 deleted the Issue#3578 branch December 14, 2023 14:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

On 3.8.1 installed from Play Store (at least), Zimit archives are not displaying correctly
5 participants