...

As of this writing, DSpace 7 is likely to require 4GB of memory at a bare minimum. However, with that little memory, you may quickly hit memory issues with any significant user activity or bulk uploading.  So, we recommend running DSpace with at least 8-12GB (or more for very large or very active sites).

...

  • 2GB of memory for the Frontend (UI) / Node.js.  Highly active sites will need more.
  • 1GB of memory for the Backend (REST API) / JVM / Tomcat. Highly active sites will need more.
  • 512MB of memory for PostgreSQL database. Highly active sites will need more.
  • 512MB of memory for Solr. Highly active sites may need more.
  • Extra memory may be required for command line scripts (which get kicked off in a separate JVM)
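As a quick sanity check, the per-component minimums in the list above add up to the 4GB bare minimum. The sketch below only restates that arithmetic; the variable names are illustrative, and it leaves out the extra memory that command line scripts may need.

```shell
# Minimum memory per component, in MB, taken from the list above.
frontend_mb=2048   # Frontend (UI) / Node.js
backend_mb=1024    # Backend (REST API) / JVM / Tomcat
postgres_mb=512    # PostgreSQL
solr_mb=512        # Solr

# Sum the minimums (command line scripts are not included).
total_mb=$((frontend_mb + backend_mb + postgres_mb + solr_mb))
echo "Minimum total: ${total_mb}MB"
```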

Keep in mind, because the frontend & backend can be run on separate servers, you can split this memory across two (or more) servers.  You can even choose to run PostgreSQL or Solr either alongside the backend or on their own dedicated server.

The DSpace frontend (UI) will often require several CPUs, especially if you wish to use "cluster mode" (see below) to better scale your application.  A smaller application may be able to use 4-6 CPU cores, while highly active sites may require additional CPU power. CPU is most often needed for the frontend's Angular Server-Side Rendering (again, see the "cluster mode" notes below) and for any batch processing / command line scripts on the backend.

Performance Tuning the Frontend (UI)

Use "cluster mode" of PM2 to avoid Node.js using a single CPU

If you are using PM2 to run the User Interface, you may want to start it using PM2's "Cluster Mode".  This allows Node.js applications to be scaled across multiple CPUs by using the Node.js cluster module.  See the PM2 Cluster Mode documentation at https://pm2.keymetrics.io/docs/usage/cluster-mode/

There are two ways to enable cluster mode. Choose one.

  1. The first way is to add the "exec_mode" and "instances" settings to your JSON configuration as follows.  You may also want to set the "max_memory_restart" option to keep PM2 from using too much memory.  These three settings are described in more detail below.  NOTE: make sure to start (or restart) your site to enable these settings (e.g. pm2 start dspace-ui.json)

    Code Block (dspace-ui.json)
    {
        "apps": [
            {
               "name": "dspace-ui",
               "cwd": "/full/path/to/dspace-ui-deploy",
               "script": "dist/server/main.js",
               "instances": "max",
               "exec_mode": "cluster",
               "env": {
                  "NODE_ENV": "production"
               },
               "max_memory_restart": "500M"
            }
        ]
    }


    1. Setting "exec_mode" to "cluster" enables cluster mode.
    2. The "instances" setting controls how many instances PM2 starts, and therefore how many CPUs it can use ("max" = one instance per available CPU; you can also specify a number, e.g. "8" = 8 instances).
    3. The "max_memory_restart" setting is optional but tells PM2 how much memory to allow per instance.  The example above has a maximum of 500MB.  If the number of 'instances' is 8, that would mean PM2 could use up to 8 x 500MB = 4GB of memory.  Therefore, you may wish to modify the values of "instances" and/or "max_memory_restart" to better control the memory available to PM2.
  2. Alternatively, you can use command line flags to specify the same settings described above.  The "-i" flag enables cluster mode and specifies the number of instances.  The "--max-memory-restart" flag limits the memory per instance.

    Code Block
    # Start the "dspace-ui" app. Cluster it across all available CPUs with a maximum memory of 500MB per instance.
    # This command is equivalent to the example cluster settings in the "dspace-ui.json" file above.
    pm2 start dspace-ui.json -i max --max-memory-restart 500M
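The memory arithmetic described above (instances x "max_memory_restart") can also be turned around to choose an instance count that fits a fixed memory budget. The following is a minimal sketch, not part of PM2: the `pick_instances` helper and the 4096MB budget are hypothetical, and it assumes a bash shell with `nproc` available (falling back to 1 CPU otherwise). It only prints the resulting command rather than running it.

```shell
# pick_instances CPUS BUDGET_MB PER_INSTANCE_MB  (hypothetical helper)
# Prints the largest instance count that fits both the CPU count
# and the memory budget, with a floor of one instance.
pick_instances() {
  local cpus=$1 budget_mb=$2 per_instance_mb=$3
  local by_memory=$((budget_mb / per_instance_mb))
  local n=$(( by_memory < cpus ? by_memory : cpus ))
  if [ "$n" -lt 1 ]; then
    n=1
  fi
  echo "$n"
}

# Assumed values: a 4096MB budget for the UI and 500MB per instance.
cpus=$(nproc 2>/dev/null || echo 1)
instances=$(pick_instances "$cpus" 4096 500)
echo "pm2 start dspace-ui.json -i $instances --max-memory-restart 500M"
```

With these assumed values, a 16-CPU server would be limited to 8 instances by memory, while a 2-CPU server would be limited to 2 instances by CPU count.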


Give Node.js more memory

On machines with >2GB of memory available, Node will only use a maximum of 2GB of memory by default (see https://github.com/nodejs/node/issues/28202).  This 2GB of memory should be enough to build & run the User Interface, but it's possible that highly active sites may require 4GB or more.

...

Code Block
# Increase memory limit to 4GB (4096MB) by setting "max-old-space-size"
# in your NODE_OPTIONS environment variable
export NODE_OPTIONS=--max-old-space-size=4096
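To confirm that the setting took effect, one option is to ask Node.js for its V8 heap limit. This is a sketch that assumes `node` is on your PATH; it uses the `getHeapStatistics()` function of Node's built-in `v8` module. The reported limit is typically close to, but not exactly, the configured value.

```shell
# Set the limit as described above (4GB = 4096MB).
export NODE_OPTIONS=--max-old-space-size=4096

# If Node.js is installed, print the heap size limit (in MB) that V8 reports.
if command -v node >/dev/null 2>&1; then
  node -e 'const v8 = require("v8"); console.log(Math.round(v8.getHeapStatistics().heap_size_limit / 1048576) + " MB");'
else
  echo "node not found on PATH; skipping heap limit check"
fi
```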


Turn on (or increase) caching of Server-Side Rendered pages

As of DSpace 7.5, we provide basic, in-memory caching of server-side rendered (SSR) pages.  Server-side rendering is used to pre-generate full HTML pages to pass back to users (primarily anonymous users and bots).  This is necessary for Search Engine Optimization (SEO), as some web crawlers cannot execute JavaScript.  It can also be used to immediately show the first HTML page to users while the JavaScript app loads in the user's browser.

While server-side rendering is highly recommended on all sites, it can result in Node.js having to pre-generate many HTML pages at once when a site has a large number of simultaneous users/bots.  This may cause Node.js to spend a lot of time processing server-side rendered content, slowing down the entire site.

Therefore, DSpace provides some basic caching of server-side rendered pages, which allows the same pre-generated HTML to be sent to many users/bots at once & decreases the frequency of server-side rendering.

These settings are documented at User Interface Configuration: Cache Settings - Server Side Rendering (SSR)


Performance Tuning the Backend (REST API)

...

If you are seeing "java.lang.OutOfMemoryError: PermGen space" errors, this is a sure sign that Tomcat is running out of PermGen memory. (More info on PermGen space: https://frankkieviet.blogspot.com/2006/10/classloader-leaks-dreaded-permgen-space.html)

To increase the amount of PermGen memory available to Tomcat (default=64MB), use either the JAVA_OPTS or CATALINA_OPTS environment variable, e.g.:

...

For More PostgreSQL Tips

For more hints/tips with PostgreSQL configurations and performance tuning, see also:


Performance Tuning Solr

Solr has its own detailed documentation with recommendations for "Taking Solr to Production".  We recommend following Solr's guidance, especially the sections on "Ulimit settings" (for Unix-based systems) and "Avoiding Swapping" (for Unix-based systems).  See the Solr documentation for more details.
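As a starting point for the "Ulimit settings" check, the sketch below reports the current open-file and process limits for the user running it. It assumes a bash shell on a Unix-based system; the `check_limit` helper is hypothetical, and the 65000 threshold is an assumption based on the warning Solr's startup script prints, so verify it against the Solr documentation for your version. Run it as the user that starts Solr.

```shell
# Threshold assumed from Solr's startup warning; confirm for your Solr version.
RECOMMENDED=65000

# check_limit LABEL VALUE  (hypothetical helper)
# Reports whether a limit is "unlimited" or at least the recommended value.
check_limit() {
  if [ "$2" = "unlimited" ] || [ "$2" -ge "$RECOMMENDED" ]; then
    echo "$1: $2 (ok)"
  else
    echo "$1: $2 (below $RECOMMENDED)"
  fi
}

# Inspect the limits of the current shell user.
check_limit "open files" "$(ulimit -n)"
check_limit "max processes" "$(ulimit -u)"
```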