I'm in the process of writing docs for a lib I'm working on. The library solves a general problem and then provides special utilities for easier usage with specific other libraries. This leads to a structure like this:
docs
  htmldocs
    <nim doc files should go in here>
src
  lib.nim
  lib
    module1.nim
    module2.nim
    integrations
      module3.nim
      module4.nim
The thing is that module3 and module4 are not imported by anything outside of the integrations directory, but they themselves import module1 and module2.
Therefore, when generating the docs from lib.nim with --project, no docs are generated for them (which makes sense).
I still want them in my compiled docs though, so I read through https://nim-lang.org/docs/docgen.html .
Based on my current understanding of that page, these are the commands I execute:
nim doc --hints:off --project --index:only --outdir:./docs/htmldocs src/threadButler.nim
nim doc --hints:off --index:on --outdir:./docs/htmldocs/threadButler/integrations src/threadButler/integration/owlButler.nim
nim doc --hints:off --index:only --outdir:./docs/htmldocs src/threadButler/integration/owlCodegen.nim
nim doc --hints:off --project --index:on --outdir:./docs/htmldocs src/threadButler.nim
That leads to output I can look at; however, I can't use importdoc inside the modules in integrations (they blow up), and the links for the imports in the HTML files of owlButler.nim and owlCodegen.nim are off. The import paths that e.g. owlCodegen.nim has are:
import ../codegen
import ../register
import ../validation
which it turns into the following URLs:
# Should be <localhost>/docs/htmldocs/threadButler/codegen.html
<localhost>/docs/htmldocs/threadButler/integrations/codegen.html
# Should be <localhost>/docs/htmldocs/threadButler/register.html
<localhost>/docs/htmldocs/threadButler/integrations/register.html
# Should be <localhost>/docs/htmldocs/threadButler/validation.html
<localhost>/docs/htmldocs/threadButler/integrations/validation.html
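For reference, what I mean by "use importdoc" is putting directives roughly like the following at the top of owlCodegen.nim (syntax as I understand it from the docgen page; the relative paths shown here are my guess, and getting them to resolve is exactly the part that fails):

## .. importdoc:: ../codegen.nim
## .. importdoc:: ../register.nim
## .. importdoc:: ../validation.nim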
I think I need to use --docroot here, but I'd really rather not, because this eventually also needs to run in a GitHub workflow, which will have a different absolute path to the project than my machine does.
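What I would prefer is computing all paths relative to the checkout, e.g. via a NimScript task along these lines (just a sketch; docs.nims and the task name are hypothetical, the commands are simply the ones from above):

# docs.nims -- hypothetical sketch, not an existing file in the project
task docs, "Build the HTML docs, including the integrations modules":
  let root = thisDir()                  # project root, wherever it was checked out
  let outDir = root & "/docs/htmldocs"
  # main project docs
  exec "nim doc --hints:off --project --index:on --outdir:" & outDir &
       " " & root & "/src/threadButler.nim"
  # integration modules that nothing in the main project imports
  exec "nim doc --hints:off --index:on --outdir:" & outDir & "/threadButler/integrations" &
       " " & root & "/src/threadButler/integration/owlButler.nim"

That would at least keep the invocation identical locally and in CI, even if it doesn't solve the link problem by itself.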
What is one supposed to do here?
> I think I need to use --docroot here, but I'd really rather not, because this eventually also needs to run in a GitHub workflow, which will have a different absolute path to the project than my machine does.
Welcome to nimble complete lunacy.
Anyway, I stumbled on what may be similar issues with the Arraymancer docs; see my discussion with @haxscramper a while ago: https://github.com/haxscramper/haxdoc/issues/1
Definitely cool to see that I'm not the first person to crash into this issue.
What this looks like to me, though, is that while haxdoc would definitely be the place to contribute, and I'm happy to trust hax when they say this is all only of moderate difficulty and just a matter of implementation (that's how I interpret their second post), it also means that the only solution seems to be to first work on said docgen tool?
At least the wishlist and README of the project appear to point in that direction.
I guess another good option could be to put more weight behind https://github.com/nim-lang/RFCs/issues/447
Once you have a JSON representation, the rest is comparatively easy to do; getting a proper JSON representation in the first place is going to need a bit of work though, given that e.g. jsondoc doesn't split out doc comments.
RST->JSON has been implemented in the compiler by @haxscramper here: https://github.com/haxscramper/nimskull/commit/df5de0b38f652f271a0767abcc4c9cad49a6d1ec
There is a large squashed PR with more work, but I can't review it: https://github.com/haxscramper/nimskull/commit/9b0687303b3515fa85d556ba278bc14a15fb1185
Each stage might theoretically be done as a separate tool; part 5 can certainly be done as one. It might even be a JS page that renders everything on the fly, by the way, with highly dynamic sorting etc.
Since haxdoc was implemented I have gotten some ideas about a more lightweight approach to how things could be done, but the main objective is to get the data format right; this is purely a data-transformation issue.
So the actionable steps I suggest would be to define a sensible data format and then see how much work would be needed to make nim doc operate with it instead.
I'm writing from my phone right now, so this is just the general direction; I will add more details later today.
Yes, but: what should the JSON format be, how would it integrate runnable examples, at what stage is it appropriate to resolve links to other pages (IIRC nim doc supports this now, but I'm not sure), and how should the resulting JSON be organized so extra tools can work with it without having to import the compiler?
Those are the questions that come up immediately, and points 1, 2, 4, 5 (collect the data, execute runnable examples, parse RST, write the IR for other tools) would still be necessary. Point 3 (name resolution) can be dropped if it is not needed, although a fully qualified name can be used as an id.
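To make those five points concrete, here is a rough sketch of the stages as procs (every type and proc name here is invented purely for illustration; none of this exists):

# Hypothetical outline of the pipeline stages discussed above.
type
  RawEntry = object      # symbol plus its attached doc comment, straight from the AST
  ResolvedEntry = object # entry with cross-module names/links resolved (stage 3, optional)
  DocIr = object         # the JSON-serializable IR that other tools consume

proc collect(module: string): seq[RawEntry] = discard                   # 1: collect the data
proc runExamples(entries: seq[RawEntry]): seq[RawEntry] = discard       # 2: execute runnable examples
proc resolveNames(entries: seq[RawEntry]): seq[ResolvedEntry] = discard # 3: name resolution (optional)
proc parseDocComments(entries: seq[ResolvedEntry]): DocIr = discard     # 4: parse the RST in doc comments
proc writeIr(ir: DocIr, path: string) = discard                         # 5: write the IR for other tools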
I checked the current nim jsondoc output and it seems there is some terminology difference, because for
type
  Enum* = enum
    value1
    value2
    value3 ## Some value

  Object* = object
    field1: int ## Documentation
    field2*: Enum

proc returnX*(): Enum =
  ## Documentat
  runnableExamples:
    echo 12
  value3
it gives
{
  "orig": "/tmp/a.nim",
  "nimble": "unknown",
  "moduleDescription": "",
  "entries": [
    {
      "name": "Enum",
      "type": "skType",
      "line": 2,
      "col": 8,
      "code": "Enum = enum\n value1, value2, value3 ## Some value"
    },
    {
      "name": "Object",
      "type": "skType",
      "line": 7,
      "col": 10,
      "code": "Object = object\n ## Documentation\n field2*: Enum\n"
    },
    {
      "name": "returnX",
      "type": "skProc",
      "line": 12,
      "col": 0,
      "code": "proc returnX(): Enum {.raises: [], tags: [], forbids: [].}",
      "signature": {
        "return": "Enum",
        "pragmas": [
          "raises: []",
          "tags: []",
          "forbids: []"
        ]
      },
      "description": "Documentat"
    }
  ]
}
Which I would describe as "here is some plaintext code copy-pasted from your file, figure out the rest yourself". For example, if I want the documentation for enums to be formatted as a table -- good luck.
I was talking about having a structure similar to the code below, where you construct data that actually makes sense as a representation of the documentation, and then write it out as JSON. Then anyone is free to take the doc_ir.nim file and implement the rendering tool of their choice.
import std/options

type
  DocQualName = object
    name: string
    module: string
    path: seq[string] ## Qualify the full name from the library root, to avoid
                      ## mixing up two types in two modules with the same name,
                      ## like `path1/mod.nim` declaring `Type` and then
                      ## `path2/mod.nim` declaring `Type` as well.

  DocType = object
    fullName: DocQualName
    parameters: seq[DocType]
    # Not writing out the whole variant tree for number literals and so on,
    # but the type should map the type as seen by the user, like array,
    # static[int] etc.; function types are omitted as well.

  DocField = object
    name: string
    exported: bool
    case isCase: bool
    of true:
      nested: seq[DocField]
    of false:
      typ: DocType

  DocObject = object
    fields: seq[DocField]
    docs: string
    base: Option[DocQualName]
    exported: bool

  DocArg = object
    name: string
    typ: DocType

  DocRunnableExample = object
    code: string
    nimCParams: seq[string]

  DocProc = object
    name: DocQualName
    arguments: seq[DocArg]
    returnType: Option[DocType]
    documentation: string
    raises: seq[DocType]
    effects: seq[DocType]

  DocEnumField = object
    name: string
    documentation: string
    value: string # Or variant type?

  DocEnum = object
    name: DocQualName
    fields: seq[DocEnumField]
    exported: bool

  DocEntryKind = enum # added here so the DocEntry variant below compiles
    dekEnum, dekObject, dekProc, dekExample, dekComment

  DocEntry = object
    case kind: DocEntryKind
    of dekEnum:
      denum: DocEnum
    of dekObject:
      dobject: DocObject
    of dekProc:
      dproc: DocProc
    of dekExample:
      dexample: DocRunnableExample ## Toplevel runnable example
    of dekComment:
      comment: string ## Toplevel comment

  DocModule = object
    imports: seq[DocQualName]
    toplevel: seq[DocEntry]
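As a sanity check that such an IR covers the "enum as a table" case from earlier, here is a minimal consumer sketch (doc_ir is the hypothetical module containing the types above, assuming its types are exported; nothing here is an existing tool):

import std/strformat
import doc_ir # the hypothetical module sketched above

proc renderEnumTable(e: DocEnum): string =
  # Render an enum as a plain Markdown-style table.
  result = "| name | value | description |\n| --- | --- | --- |\n"
  for f in e.fields:
    result.add &"| {f.name} | {f.value} | {f.documentation} |\n"

when isMainModule:
  let e = DocEnum(
    name: DocQualName(name: "Enum", module: "a", path: @["a"]),
    exported: true,
    fields: @[
      DocEnumField(name: "value1", value: "0", documentation: ""),
      DocEnumField(name: "value2", value: "1", documentation: ""),
      DocEnumField(name: "value3", value: "2", documentation: "Some value")])
  echo renderEnumTable(e)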
nim doc and nim jsondoc at the moment seem to have duplicated implementation logic: nimdoc.genJsonItem and nimdoc.genItem both write out their result formats right away. In the JSON case it is code like
let
  paramName = $n[paramsPos][paramIdx][identIdx]
  paramType = $n[paramsPos][paramIdx][^2]
if n[paramsPos][paramIdx][^1].kind != nkEmpty:
  let paramDefault = $n[paramsPos][paramIdx][^1]
  result.json["signature"]["arguments"].add %{"name": %paramName, "type": %paramType, "default": %paramDefault}
else:
  result.json["signature"]["arguments"].add %{"name": %paramName, "type": %paramType}
that transfers from PNodes to JSON immediately, while the HTML generation jumps directly into configuration-dependent logic to fetch things like doc.item.tocTable and splice them together:
d.tocTable[k].mgetOrPut(cleanPlainSymbol, newSeq[TocItem]()).add TocItem(
  sortName: sortName,
  content: getConfigVar(d.conf, "doc.item.tocTable") % [
    "name", name, "header_plain", plainNameEsc,
where doc.item.tocTable is something the user can configure. The code above is in genItem, which is called via the chain genItem <- generateDoc <- processNode (docgen2.nim) <- processPipeline (pipelines.nim) <- processPipelineModule ... Anyway, the logic of the current approach is
compute:
  compute:
    compute:
      compute:
        action
whereas I'm mainly advocating for a solution where we can do
json_data: seq[NimModule] = compute:
  compute:
    compute:
      compute:

html_value = rendering(json_data) # External user tools, custom rendering, anything
So the tl;dr suggestion would be to throw out all the cruft in the docgen, make it collect a list of modules, write that list out in JSON format, and call it a day. Or do this per module; in that case the user can write out a collection of module3.json and module4.json files, as the original post needed, and some extra tool will stitch them together.
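A sketch of what such a stitching tool could look like (file names follow the layout from the original post; none of this exists yet):

import std/json

proc stitch(outFile: string; moduleFiles: openArray[string]) =
  # Merge the per-module JSON files into a single document a renderer can consume.
  var merged = newJObject()
  merged["modules"] = newJArray()
  for f in moduleFiles:
    merged["modules"].add parseFile(f)
  writeFile(outFile, merged.pretty)

when isMainModule:
  stitch("docs/htmldocs/docs.json",
         ["docs/module3.json", "docs/module4.json"])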
But in my estimation the current JSON output is by no means at the point where "most stuff is already there".