Monday, June 23, 2014

How to Search & Modify All Hyperlinks in MS Word File inside Android Apps

This technical tip explains how developers can find and modify all hyperlinks in a Microsoft Word document inside Android applications. A Hyperlink object with properties would make this easy, but the current version of Aspose.Words has no built-in functionality for hyperlink fields.

Hyperlinks in Microsoft Word documents are fields, and a field consists of a field code and a field result. The current version of Aspose.Words has no single object that represents a field. Instead, it represents a field as a sequence of nodes: FieldStart, one or more Run nodes holding the field code, FieldSeparator, one or more Run nodes holding the field result, and FieldEnd. Although there is no high-level abstraction for fields, all of the necessary low-level document elements and their properties are exposed, so with a bit of coding you can implement quite sophisticated document manipulation features.

This example shows how to create a simple class that represents a hyperlink in the document. Its constructor accepts a FieldStart object whose field type must be FieldType.FIELD_HYPERLINK. Once you have a Hyperlink instance, you can get or set its target, name, and isLocal values through the corresponding methods, which makes it easy to change the targets and names of hyperlinks throughout the document. In the example, all non-local hyperlinks are changed to point to “http://www.aspose.com”.
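
To make the node sequence described above concrete, here is a minimal sketch that models a field with plain Java objects rather than Aspose.Words nodes. The Item class, the Kind enum, and the textBetween method are hypothetical stand-ins invented for this illustration; they are not part of the Aspose.Words API. The sketch only shows why the field code and the field result each have to be assembled from every Run that sits between the surrounding markers.

```java
import java.util.Arrays;
import java.util.List;

class FieldLayoutSketch {
    // Hypothetical stand-ins for the node types that make up a field.
    enum Kind { FIELD_START, RUN, FIELD_SEPARATOR, FIELD_END }

    /** A stand-in for a document node: a kind plus the text it carries (empty for markers). */
    static final class Item {
        final Kind kind;
        final String text;
        Item(Kind kind, String text) { this.kind = kind; this.text = text; }
    }

    static Item marker(Kind kind) { return new Item(kind, ""); }
    static Item run(String text) { return new Item(Kind.RUN, text); }

    /** Joins the text of all runs after the first 'from' marker and before the next 'to' marker. */
    static String textBetween(List<Item> items, Kind from, Kind to) {
        StringBuilder sb = new StringBuilder();
        boolean inside = false;
        for (Item item : items) {
            if (!inside) {
                inside = (item.kind == from);
                continue;
            }
            if (item.kind == to)
                break;
            if (item.kind == Kind.RUN)
                sb.append(item.text);
        }
        return sb.toString();
    }

    /** A hyperlink field whose code and result are each split across several runs. */
    static List<Item> sampleField() {
        return Arrays.asList(
            marker(Kind.FIELD_START),
            run(" HYPERLINK "), run("\"http://www."), run("aspose.com\" "),
            marker(Kind.FIELD_SEPARATOR),
            run("Aspose"), run(" home page"),
            marker(Kind.FIELD_END));
    }

    public static void main(String[] args) {
        List<Item> field = sampleField();
        System.out.println(textBetween(field, Kind.FIELD_START, Kind.FIELD_SEPARATOR).trim());
        System.out.println(textBetween(field, Kind.FIELD_SEPARATOR, Kind.FIELD_END));
    }
}
```

Reading only the first run after the start marker would return a truncated field code here, which is why a robust helper has to walk all sibling runs up to the separator.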

The code below finds all hyperlinks in a Word document and changes their URL and display name.

package Examples;
import org.testng.annotations.Test;
import com.aspose.words.Document;
import com.aspose.words.NodeList;
import com.aspose.words.FieldStart;
import com.aspose.words.FieldType;
import com.aspose.words.NodeType;
import com.aspose.words.Run;
import com.aspose.words.Node;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Shows how to replace hyperlinks in a Word document.
 */
public class ExReplaceHyperlinks extends ExBase
{
    /**
     * Finds all hyperlinks in a Word document and changes their URL and display name.
     */
    public void replaceHyperlinks() throws Exception
    {
        // Specify your document name here.
        Document doc = new Document(getMyDir() + "ReplaceHyperlinks.doc");
        // Hyperlinks in a Word document are fields, so select all field start nodes to find them.
        NodeList fieldStarts = doc.selectNodes("//FieldStart");
        for (FieldStart fieldStart : (Iterable<FieldStart>) fieldStarts)
        {
            if (fieldStart.getFieldType() == FieldType.FIELD_HYPERLINK)
            {
                // The field is a hyperlink field, use the "facade" class to help to deal with the field.
                Hyperlink hyperlink = new Hyperlink(fieldStart);
                // Some hyperlinks can be local (links to bookmarks inside the document), ignore these.
                if (hyperlink.isLocal())
                    continue;
                // The Hyperlink class makes it easy to set the target URL and the display name
                // of the link through its setters.
                hyperlink.setTarget(NEW_URL);
                hyperlink.setName(NEW_NAME);
            }
        }
        doc.save(getMyDir() + "ReplaceHyperlinks Out.doc");
    }
    private static final String NEW_URL = "http://www.aspose.com";
    private static final String NEW_NAME = "Aspose - The .NET & Java Component Publisher";
}
/**
 * This "facade" class makes it easier to work with a hyperlink field in a Word document.
 *
 * A hyperlink is represented by a HYPERLINK field in a Word document. A field in Aspose.Words
 * consists of several nodes and it might be difficult to work with all those nodes directly.
 * Note this is a simple implementation and will work only if the hyperlink code and name
 * each consist of one Run only.
 *
 * [FieldStart][Run - field code][FieldSeparator][Run - field result][FieldEnd]
 *
 * The field code contains a string in one of these formats:
 * HYPERLINK "url"
 * HYPERLINK \l "bookmark name"
 *
 * The field result contains text that is displayed to the user.
 */
class Hyperlink
{
    Hyperlink(FieldStart fieldStart) throws Exception
    {
        if (fieldStart == null)
            throw new IllegalArgumentException("fieldStart");
        if (fieldStart.getFieldType() != FieldType.FIELD_HYPERLINK)
            throw new IllegalArgumentException("Field start type must be FieldHyperlink.");
        mFieldStart = fieldStart;
        // Find the field separator node.
        mFieldSeparator = findNextSibling(mFieldStart, NodeType.FIELD_SEPARATOR);
        if (mFieldSeparator == null)
            throw new IllegalStateException("Cannot find field separator.");
        // Find the field end node. Normally field end will always be found, but in the example document
        // there happens to be a paragraph break included in the hyperlink and this puts the field end
        // in the next paragraph. It will be much more complicated to handle fields which span several
        // paragraphs correctly, but in this case allowing field end to be null is enough for our purposes.
        mFieldEnd = findNextSibling(mFieldSeparator, NodeType.FIELD_END);
        // Field code looks something like [ HYPERLINK "http:\\www.myurl.com" ], but it can consist of several runs.
        String fieldCode = getTextSameParent(mFieldStart.getNextSibling(), mFieldSeparator);
        Matcher matcher = G_REGEX.matcher(fieldCode.trim());
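        // Fail with a clear message instead of an obscure error from group() when nothing matches.
        if (!matcher.find())
            throw new IllegalStateException("Cannot parse hyperlink field code: " + fieldCode);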
        mIsLocal = (matcher.group(1) != null) && (matcher.group(1).length() > 0);    //The link is local if \l is present in the field code.
        mTarget = matcher.group(2).toString();
    }
    /**
     * Gets or sets the display name of the hyperlink.
     */
    String getName() throws Exception
    {
        return getTextSameParent(mFieldSeparator, mFieldEnd);
    }
    void setName(String value) throws Exception
    {
        // Hyperlink display name is stored in the field result which is a Run
        // node between field separator and field end.
        Run fieldResult = (Run)mFieldSeparator.getNextSibling();
        fieldResult.setText(value);
        // But sometimes the field result can consist of more than one run, delete these runs.
        removeSameParent(fieldResult.getNextSibling(), mFieldEnd);
    }
    /**
     * Gets or sets the target URL or bookmark name of the hyperlink.
     */
    String getTarget() throws Exception
    {
        return mTarget;
    }
    void setTarget(String value) throws Exception
    {
        mTarget = value;
        updateFieldCode();
    }
    /**
     * True if the hyperlink's target is a bookmark inside the document. False if the hyperlink is a URL.
     */
    boolean isLocal() throws Exception
    {
        return mIsLocal;
    }
    void isLocal(boolean value) throws Exception
    {
        mIsLocal = value;
        updateFieldCode();
    }
    private void updateFieldCode() throws Exception
    {
        // Field code is stored in a Run node between field start and field separator.
        Run fieldCode = (Run)mFieldStart.getNextSibling();
        fieldCode.setText(java.text.MessageFormat.format("HYPERLINK {0}\"{1}\"", ((mIsLocal) ? "\\l " : ""), mTarget));
        // But sometimes the field code can consist of more than one run, delete these runs.
        removeSameParent(fieldCode.getNextSibling(), mFieldSeparator);
    }
    /**
     * Goes through siblings starting from the start node until it finds a node of the specified type or null.
     */
    private static Node findNextSibling(Node startNode, int nodeType) throws Exception
    {
        for (Node node = startNode; node != null; node = node.getNextSibling())
        {
            if (node.getNodeType() == nodeType)
                return node;
        }
        return null;
    }
    /**
     * Retrieves text from start up to but not including the end node.
     */
    private static String getTextSameParent(Node startNode, Node endNode) throws Exception
    {
        if ((endNode != null) && (startNode.getParentNode() != endNode.getParentNode()))
            throw new IllegalArgumentException("Start and end nodes are expected to have the same parent.");
        StringBuilder builder = new StringBuilder();
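        // Stop at the last sibling too, so a null end node (field end not found) does not cause a NullPointerException.
        for (Node child = startNode; (child != null) && !child.equals(endNode); child = child.getNextSibling())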
            builder.append(child.getText());
        return builder.toString();
    }
    /**
     * Removes nodes from start up to but not including the end node.
     * Start and end are assumed to have the same parent.
     */
    private static void removeSameParent(Node startNode, Node endNode) throws Exception
    {
        if ((endNode != null) && (startNode.getParentNode() != endNode.getParentNode()))
            throw new IllegalArgumentException("Start and end nodes are expected to have the same parent.");
        Node curChild = startNode;
        while ((curChild != null) && (curChild != endNode))
        {
            Node nextChild = curChild.getNextSibling();
            curChild.remove();
            curChild = nextChild;
        }
    }
    private final Node mFieldStart;
    private final Node mFieldSeparator;
    private final Node mFieldEnd;
    private boolean mIsLocal;
    private String mTarget;
    /**
     * Matches a field code such as HYPERLINK "url" or HYPERLINK \l "bookmark name".
     * Group 1 is the optional \l switch and group 2 is the hyperlink target.
     */
    private static final Pattern G_REGEX = Pattern.compile(
        "\\S+" +            // one or more non-space characters: HYPERLINK, or the equivalent word in other languages
        "\\s+" +            // one or more spaces
        "(?:\"\"\\s+)?" +   // non-capturing optional "" followed by one or more spaces, found in one customer's file
        "(\\\\l\\s+)?" +    // optional \l flag followed by one or more spaces
        "\"" +              // opening double quote
        "([^\"]+)" +        // one or more characters other than a double quote (the hyperlink target)
        "\""                // closing double quote
        );
}
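
The field-code handling in the Hyperlink class can be tried without a document or the Aspose.Words library. The following is a minimal sketch that uses only the JDK. The names FieldCodeDemo, Parsed, parse, and build are hypothetical and exist only for this illustration, and the pattern assumes that the target itself contains no double quotes.

```java
import java.text.MessageFormat;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FieldCodeDemo {
    // Field name, optional "", optional \l switch (group 1), quoted target (group 2).
    private static final Pattern FIELD_CODE = Pattern.compile(
        "\\S+\\s+(?:\"\"\\s+)?(\\\\l\\s+)?\"([^\"]+)\"");

    /** Parsed pieces of a HYPERLINK field code. */
    static final class Parsed {
        final boolean local;
        final String target;
        Parsed(boolean local, String target) { this.local = local; this.target = target; }
    }

    /** Returns the parsed field code, or null if the text does not look like one. */
    static Parsed parse(String fieldCode) {
        Matcher m = FIELD_CODE.matcher(fieldCode.trim());
        if (!m.find())
            return null;
        // The link is local when the \l switch was captured.
        return new Parsed(m.group(1) != null, m.group(2));
    }

    /** Rebuilds the field code text in the same format that updateFieldCode() writes. */
    static String build(boolean local, String target) {
        return MessageFormat.format("HYPERLINK {0}\"{1}\"", local ? "\\l " : "", target);
    }

    public static void main(String[] args) {
        Parsed external = parse(" HYPERLINK \"http://www.example.com\" ");
        Parsed bookmark = parse("HYPERLINK \\l \"Bookmark1\"");
        System.out.println(external.local + " " + external.target);
        System.out.println(bookmark.local + " " + bookmark.target);
        System.out.println(build(false, "http://www.aspose.com"));
    }
}
```

Returning null for text that does not match is a choice made for this sketch; code that expects every input to be a hyperlink field could throw an exception instead.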

More about Aspose.Words for Android
Aspose.Words for Android is a Java word processing component that enables developers to generate, modify, convert and render Word documents within their Android applications. Aspose.Words supports DOC, DOCX, OOXML, RTF, HTML, XHTML, MHTML, OpenDocument, ODT, PDF, XPS, EPUB and other formats. Other useful features include document creation, content and formatting manipulation, mail merge, reporting, platform independence, performance and scalability, all with a minimal learning curve.


Sunday, June 15, 2014

Export/Restrict Hidden Worksheet to HTML & Improved Excel to PDF in Android

What’s new in this release?

The Aspose development team is pleased to announce the release of Aspose.Cells for Android v8.1.0. This release contains many useful new features, bug fixes and other enhancements.

The Aspose.Cells for Android API now provides get/set methods for the ExportHiddenWorksheet property exposed by the HtmlSaveOptions class. This Boolean property lets developers choose whether hidden worksheets are rendered. The default value is true, meaning hidden worksheets are rendered to HTML; setting it to false excludes them from the rendering process. More details can be found in the documentation article “Prevent Exporting Hidden Worksheet Contents while Saving to HTML”.

Aspose.Cells aims to provide all the features that Microsoft Excel offers. As a further step in that direction, the PivotTable class now exposes get/set methods for the DisplayNullString and NullString properties, which correspond to the pivot table option “For empty cells show” in Microsoft Excel. These properties set the string displayed in empty cells inside a pivot table. Please see the detailed article “Setting Pivot Table Option – For empty cells show”.

Aspose.Cells for Android 8.1.0 also fixes several important issues in areas such as rendering and manipulating charts and converting Microsoft Excel files to PDF and HTML. The improved features and bug fixes in this release are listed below:
  • The API can be restricted to the fonts directory specified with the CellsHelper.setFontDir method
  • Lines in diagrams are now rendered crisp and sharp
  • The display color defined in a custom number format can now be retrieved
  • Fixed: some colors were shown just before a column in a table
  • Fixed: images were missing from the resulting PDF when a spreadsheet was converted on Ubuntu
  • Fixed: gridlines and font settings were missing in the output HTML
  • Fixed: setting the print quality of worksheets
  • Fixed: PrintCopies was preserved for the XLS format but not for the XLSX format
  • Fixed: exception when saving a pivot table as MHT
  • Fixed: CellsException “Map size (0) must be >= 1”
Other recent bug fixes are also included in this release.

Overview: Aspose.Cells for Android

Aspose.Cells for Android is an MS Excel spreadsheet component that allows programmers to develop Android applications for reading, writing and manipulating Excel spreadsheets (XLS, XLSX, XLSM, SpreadsheetML, CSV, tab delimited) and HTML files without relying on Microsoft Excel. It supports a robust formula calculation engine, pivot tables, VBA, workbook encryption, named ranges, custom charts, spreadsheet formatting, drawing objects such as images and OLE objects, and importing or creating charts.